Saturday 13 June 2015

Performance Tuning - Statistics - The heart of Performance.

Last post we have seen how to read excecution plan .It was brief  post about execution plan.you may have question in your mind how this execution plan is created ?On what basis SQL Server engine generate  plan?  how many plan gets generated  and which plan SQL Engine select?
   I will try to make it as simple as .There should be something  in sql server which act as meta data to take decision  .Yes there is .That something is called as Statistics. 
  Now next question can come to in your mind how it gets created and who create it?
   Statistics get created in following cases and Sql server engine automatically creates it.
1.When we create any index .
2. Running query using where  or join clause so individual statistics will be created first time .
 It will be updated later point if auto_update_statistics is ON.

E.g.
CREATE TABLE  CREDITCARD
  (
     [Credit_Card_id] [INT] NOT NULL,
     [Name]           [VARCHAR](100) NULL,
     [Limit]          [INT] NULL,
     [Valid_from]     [INT] NULL,
     [Valid_To]       [INT] NULL,
 
  )

  Go
 select * from CREDITCARD
  Now we have created table and seeing statistics are created are not.
  You will see no statistics are created yet.

Lets run below query ,

SELECT *
  FROM   CREDITCARD
  WHERE  Credit_Card_id = 1
         AND Name = 'test'
         AND Limit = 1
         AND Valid_from = 20
         AND Valid_To = 30
Now if you refresh statistics you will see five statistics object are created .Lets see how to see  
what data is there in that statistics .


We will use DBCC   SHOW_STATISTICS to see data inside statistics. 

Note : For demo I am considering only one statistics object.Name of your statistic will be different as it creates dynamically.

  DBCC SHOW_STATISTICS('CREDITCARD',  _WA_Sys_00000001_0425A276)



Now we are seeing statistics but no data is there.Now lets do some insert.Lets  insert data and run once above command.

                insert into CREDITCARD 
               select 1,'Shrikant',25000,1,22
  
              Go
           SELECT *
                         FROM   CREDITCARD
                        WHERE  Credit_Card_id = 1
                       AND Name = 'test'
                       AND Limit = 1
                      AND Valid_from = 20
                    AND Valid_To = 30 
  
  DBCC SHOW_STATISTICS('CREDITCARD', _WA_Sys_00000001_0425A276).
Now you will see very important information.
  • The number of rows and pages occupied by a table's data
  • The time that statistics were last updated
  • The average length of keys in a column
  • Histograms showing the distribution of data in a column
  • String summaries that are used when performing LIKE queries on character data


In this third row is very important .In this we can see how many distinct values are there for that column .Ther is one column  called EQ_ROW which is very useful for calculating plan.
Lets go bit inside.I am inserting a 10000 record in above table.

  declare @lclId int =1
  while 1= 1
  begin
  if @lclId>10000
  break
  
  insert into CREDITCARD 
  select  @lclId,'Shrikant',25000,1,22
  set @lclId +=1
    end 
Now If you run following select statement and DBBC command
Go
  SELECT *
  FROM   CREDITCARD
  WHERE   
           Name = 'Shrikant'
         AND Limit = 25000
         AND Valid_from = 1
         AND Valid_To = 22  and Credit_Card_id =1
  GO
  DBCC SHOW_STATISTICS('CREDITCARD',_WA_Sys_00000001_0425A276 )
  
you can see vaues for update and EQ_Row.EQ row is used to calculate estimated row count.

Note : You can manually update statistics by using sp_updatestats procedure for entire database.
If you want to update statistics for a table in above case you can use

UPDATE STATISTICS CREDITCARD

If you want to update statistics for a specific statistics in above case you can use

UPDATE STATISTICS CREDITCARD     _WA_Sys_00000001_0425A276;


In next post we will try to understand how statistics are useful for performance tuning.
  

No comments:

Post a Comment

First Database In Sql Server