Saturday 18 January 2014

What is Fill Factor?

Fill factor is the value that determines the percentage of space on each leaf-level page to be filled with data. In SQL Server, the smallest unit of storage is a page, which is 8 KB in size. Each page can store one or more rows, depending on the size of the rows. The default value of the Fill Factor is 100, which is the same as a value of 0. With the default Fill Factor (100 or 0), SQL Server fills the leaf-level pages of an index with the maximum number of rows that will fit, so there is little or no empty space left on the page.
If a page is completely full and new data must be inserted into it, a "page split" event happens to accommodate the new data: SQL Server splits the page into two pages, moving roughly half the rows to the new page, and then inserts the new row in its logical place. A completely filled page thus becomes two half-filled pages. This page split process is expensive in terms of resources: the page is locked for a brief period, and more storage space is used to hold the same amount of data. The usual counter-argument is that storage is cheap nowadays, so let us talk about Fill Factor in terms of IO in the next paragraph.
                                  
It is absolutely true that storage is getting cheaper every day. Now assume there is a Table 1 which contains data worth 1000 pages, all 100% full. If we run a query to retrieve all the data, SQL Server reads all 1000 pages. If the pages were only 50% full, Table 1 would need 2000 pages, which means SQL Server has to read twice as much data from disk, leading to higher usage of memory, CPU and IO bandwidth.
Reading the previous paragraph gives the impression that a Fill Factor of 100 is the best option, since there is no empty space on the page and IO is always optimal. That is true only if we never insert, update or delete anything in the table. As soon as we insert new data, page splits happen and some pages end up only 50% full. For an OLTP system that is continuously modified, choosing the right Fill Factor has always been a challenge. Additionally, note that the Fill Factor is applied only when the index is originally created or rebuilt; it is not maintained afterwards when page splits happen. As discussed earlier, page splits increase IO as well as storage space.
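Since the Fill Factor is only reapplied on a rebuild, a sketch of how that looks (the index and table names here are hypothetical, and 90 is just an example value):

```sql
-- Rebuild an index and reapply a fill factor of 90.
-- IX_MyTable_MyColumn and dbo.MyTable are hypothetical names.
ALTER INDEX IX_MyTable_MyColumn ON dbo.MyTable
REBUILD WITH (FILLFACTOR = 90);
```

After the rebuild, each leaf-level page starts out roughly 90% full, leaving 10% free space to absorb new rows before a page split becomes necessary.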
So a high Fill Factor on a high-transaction server implies more page splits. A lower Fill Factor, however, can help us reduce the number of page splits, as new data can be accommodated on the page right away without much difficulty and further splits. We can measure page splits by watching the performance monitor counter "SQLServer:Access Methods\Page Splits/sec".
Additionally, you can measure the same using the following query against the sysperfinfo compatibility view.
SELECT cntr_value
FROM master.dbo.sysperfinfo
WHERE counter_name = 'Page Splits/sec'
  AND object_name LIKE '%Access methods%'
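On SQL Server 2005 and later, the same counter is also exposed through the sys.dm_os_performance_counters DMV, which supersedes sysperfinfo. Note that this counter is cumulative since the last restart, so to get a per-second rate you would sample it twice and take the difference:

```sql
-- Cumulative page split count since the instance started.
SELECT cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page Splits/sec'
  AND object_name LIKE '%Access Methods%';
```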

Fill Factor can be examined at the server level as well as at the table/index level. Below are the scripts for both.
Here is the script to measure the Fill Factor at the server level:
SELECT *
FROM sys.configurations
WHERE name = 'fill factor (%)'
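The server-wide default can also be changed, not just read. A sketch, assuming 90 is the desired default (it applies to index builds that do not specify an explicit fill factor, and takes effect after a service restart):

```sql
-- 'fill factor' is an advanced option, so expose advanced options first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Set the server-wide default fill factor to 90 (example value).
EXEC sp_configure 'fill factor', 90;
RECONFIGURE;
```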
And, here is the script to measure the Fill Factor at the table/index level:
USE YourDatabaseName;
SELECT OBJECT_NAME(object_id) AS Name, type_desc, fill_factor
FROM sys.indexes;
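To set a Fill Factor for a specific index rather than server-wide, specify it when the index is created. A minimal sketch with hypothetical names and an example value of 80:

```sql
-- Create an index that leaves 20% free space on each leaf page.
-- PAD_INDEX = ON applies the same fill factor to intermediate pages.
CREATE NONCLUSTERED INDEX IX_MyTable_MyColumn
ON dbo.MyTable (MyColumn)
WITH (FILLFACTOR = 80, PAD_INDEX = ON);
```

A per-index fill factor overrides the server-wide default for that index.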

Friday 17 January 2014

How Does SQL Server Store Data?

This is an important topic for database developers. We all know that data is stored in tables in SQL Server, and most of us would like to know a little more about how SQL Server actually stores that data. I will try to explain it in simple words.
Microsoft SQL Server databases are stored on disk in two kinds of files: a data file and a log file.
The primary data file has the .MDF extension and the transaction log file has the .LDF extension. The next question is: what kind of data is stored in the .MDF and .LDF files?

What the .MDF file stores:
The MDF file contains the database objects: tables, stored procedures, user accounts and so on. The MDF file is "attached" to the SQL Server instance, and it is created automatically when the administrator creates a new database. The file continues to grow as users insert new records into the tables. The user tables are the main storage objects in the MDF file, and these objects can hold millions of records.

                                      
MDF files can grow to several megabytes, and databases with millions of rows can grow into gigabytes. For this reason, database administrators who maintain large enterprise servers should ensure that the hard drive has enough storage space to hold the growing MDF files. This is especially important when the drive houses several databases: with each MDF file growing to several gigabytes, disk space is quickly consumed. An administrator can also copy the MDF file to external media such as an external hard drive, USB flash drive, CD or DVD as a crude backup, although the database must first be detached (or the service stopped), and the BACKUP DATABASE command is the proper way to take backups.

What the .LDF file stores:
LDF is the file extension for the transaction log file used by Microsoft SQL Server. LDF files contain logging information for all transactions performed against the server. They time-stamp every transaction to the database, which allows the database to be recovered in the case of data loss.

e.g.
I am creating a table named MyMemory. Now I am going to explain how its data is stored internally.


Query:
    Create table MyMemory (ID int identity(1,1), Name varchar(100))

Insert:
    Insert into MyMemory (Name) values ('Shrikant')
    Insert into MyMemory (Name) values ('Sai')
    Insert into MyMemory (Name) values ('Santosh')
    Insert into MyMemory (Name) values ('deelip')
Now this data is stored in the table. It will look like:
      
ID   Name
1    Shrikant
2    Sai
3    Santosh
4    deelip

We are interested to know how it is stored internally. We have DBCC commands which will help us see how the data is laid out.

Data Files Are Broken Up Into 8KB Pages 

These pages are the smallest unit of storage, both in memory and on disk. When we write the very first row into a table, SQL Server allocates an 8KB page to store that row – and maybe a few more rows, depending on the size of our data. In our MyMemory example, each row is small, so we can cram a bunch of them onto a page. Bigger rows take more space, and some data does not even fit on a single page: for example, if you added a VARCHAR(MAX) column and stuffed it with data, a single row would span multiple pages.
Each page is dedicated to just one table. If we create several different small tables, each is stored on its own pages, even if the tables are really small.
DBCC IND('Shri_Test_17012014', 'MyMemory', -1);
-- Database name: Shri_Test_17012014
-- Table name: MyMemory

The result of this command tells us how many pages the table occupies. The table above occupies 2 pages, i.e. 16KB of storage (one of the two is the IAM allocation page that tracks the data pages).
 
PagePID   ObjectID      Table Name [OBJECT_NAME(object_id)]
611164    1280827725    MyMemory
611163    1280827725    MyMemory
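Once DBCC IND gives us a PagePID, the (undocumented) DBCC PAGE command can dump the contents of that page. A sketch using the page id from the output above, assuming the page lives in data file 1 (the usual case for a database with a single data file):

```sql
-- Trace flag 3604 sends DBCC PAGE output to the client
-- instead of the error log.
DBCC TRACEON(3604);
-- Parameters: database, file id, page id, print option
-- (3 = show each row in detail).
DBCC PAGE('Shri_Test_17012014', 1, 611163, 3);
```

The output shows the page header plus each row slot, so you can see exactly where 'Shrikant', 'Sai' and the other rows sit on the page.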

How the Data File and Log File are Accessed :

Data files are a jumbled mess of stuff. You've got tables and pages all over the place, and your users are making unpredictable changes all over the place. SQL Server's access to data files tends to be random, and it's a combination of both reads and writes. The more memory your server has, the fewer data file reads happen – SQL Server caches the data pages in memory and works off that cache rather than reading from disk over and over. This is why we often suggest stuffing your SQL Server with as much memory as you can afford; it's cheaper than buying good storage.
                    Log files are written to sequentially, start to finish.  SQL Server doesn’t jump around – it just makes a little to-do list and keeps right on going.  Eventually when it reaches the end of the log file, it’ll either circle back around to the beginning and start again, or it’ll add additional space at the end and keep on writing.  Either way, though, we’re talking about sequential writes.  It’s not that we never read the log file – we do, like when we perform transaction log backups.  However, these are the exception rather than the norm.
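A transaction log backup is the typical case where the log file is read. A minimal sketch (the backup path is hypothetical, and the database must be in the FULL or BULK_LOGGED recovery model for log backups to be allowed):

```sql
-- Back up the transaction log; this reads the log sequentially
-- and truncates the inactive portion so the log file can be reused.
BACKUP LOG Shri_Test_17012014
TO DISK = 'C:\Backup\Shri_Test_17012014_log.trn';
```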
