Performance Tuning

  • I have a Table called Employees with following Columns

    EmployeeID INT

    EmployeeName VARCHAR(256)

    Salary MONEY

    DeptID SMALLINT

    The table has following Indexes created on it

    1. Clustered Index on EmployeeID

    2. Non Clustered Index on EmployeeName

    3. Non Clustered Index on DeptID

    The Employees tables has over 10 million rows.

    You want to get the count of Employees, for which you write the following query

    SELECT COUNT(*) FROM Employees

    What will be the likely query plan for above query? Which index will SQL SERVER use in above query? Please give an explanation for your answer.

  • Looks and sounds like an interview question or homework to me. Which one do you think?



    MCSE: Data Platform
    MCSE: Business Intelligence
    Follow me on Twitter: @WazzTheBadger
    LinkedIn Profile: Simon Osborne

  • Why don't you run the query and see which index and execution plan it will use? If you would like help understanding why SQL picked the specific plan that it did, feel free to ask.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks For The Reply Monster ,

    i am new to Sql server.Running The Query

    on millions of records is completly different from running the query on small datasets as i think.Hence i want to run the query on millions of records and

    know which indexes are getting used.

  • Recently I've been doing lots of work on large data sets, wide tables (20+ columns) with mixture of Unicode and non-Unicode data types and IO becomes a real issue when dealing with large tables. With this experience plus the great advice I've read about on this site I've come to realise the importance of IO and how it plays a big part in how SQL server ultimately does what you ask of it.

    To answer your question, SQL would likely choose the index with the smallest storage requirement so that would be the DeptID as its TINYINT so requires fewer pages to store so less IO. In other words, it's probably the most efficient way to use the index with the smalles storage requirement to retrun the table count.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Thank You Dina,

    Good Explanation

  • Ananth@Sql (4/18/2013)


    Running The Query

    on millions of records is completly different from running the query on small datasets as i think.Hence i want to run the query on millions of records and know which indexes are getting used.

    So create a table, populate it with a few million rows, run the query and see how it behaves.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • To illustrate this I did this simple test where I did a count(*) on a table (34 million records) with just a clustsred index (picture 1), I then created a non clustsred index on a BIGINT column and did another count (Picture 2)

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • You can apply logic to what you think the execution plan will look like, but in my experience trying to second guess the Optimizer is notoriously difficult, as my next post will show.

    i.e. taking into account indexes, statistics and existing cached plans and the query construction itself, the plan may not be what logic would dictate.

  • Thank you Dina

    Your Explanation And presentation Makes Me Clear Of This Question.

    And for a clarification tell Me The datatype Of Clustered Index Column.

  • Just to illustrate what Gail was saying. . .

    You could easily test this out yourself.

    USE tempdb;

    IF object_id('dbo.Employees') IS NOT NULL

    BEGIN

    DROP TABLE dbo.Employees;

    END;

    --10,000,000 Random rows of data

    SELECT TOP 10000000 IDENTITY(INT,1,1) AS EmployeeID,

    CAST(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(

    REPLACE(CHAR((ABS(CHECKSUM(NEWID())) % 25) + 65)+LOWER(REPLACE(

    CAST(NEWID() AS VARCHAR(36)),'-','')) COLLATE Latin1_General_BIN2,'0',''),'1',''),'2',''),'3',''),

    '4',''),'5',''),'6',''),'7',''),'8',''),'9','') AS VARCHAR(255)) AS EmployeeName,

    CONVERT(MONEY,CONVERT(VARCHAR(6),CONVERT(MONEY,RAND(CHECKSUM(NEWID())) * 100),0)) AS Salary,

    CAST((ABS(CHECKSUM(NEWID())) % 100) + 1 AS SMALLINT) AS DeptID

    INTO dbo.Employees

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

    CREATE CLUSTERED INDEX Employees_cl_EmployeeID ON dbo.Employees (EmployeeID);

    CREATE NONCLUSTERED INDEX Employees_nc_EmployeeName ON dbo.Employees (EmployeeName);

    CREATE NONCLUSTERED INDEX Employees_nc_DeptID ON dbo.Employees (DeptID);

    So the above will produce an "Employees" table with 10,000,000 rows in it and the indexes you stated.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • It came down to the number of pages used by the index (fewer pages = less IO)

    The top row is the clustsred index, the bottom row is the non clustered index I created recently. SQL ended up using the index with the fewest number of pages.

    Try it out as the monster had suggested 😀

    Good luck!

    p.s.

    Before anyone starts bashing me for having so many indexes on a table, YES, these are necessary and you dont' know what I'm doing and what the table is for... lol 😛

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Ananth@Sql (4/18/2013)


    Thank you Dina

    Your Explanation And presentation Makes Me Clear Of This Question.

    And for a clarification tell Me The datatype Of Clustered Index Column.

    Not necessarily relevant as the clustered index at the leaf level is the table. This means there could be as few as 1 record per page depending on how wide the row is and SQL Server is doing a clustered index scan. This means it is scan the table using a lot of IO.

  • Lynn Pettis (4/18/2013)


    Ananth@Sql (4/18/2013)


    Thank you Dina

    Your Explanation And presentation Makes Me Clear Of This Question.

    And for a clarification tell Me The datatype Of Clustered Index Column.

    Not necessarily relevant as the clustered index at the leaf level is the table. This means there could be as few as 1 record per page depending on how wide the row is and SQL Server is doing a clustered index scan. This means it is scan the table using a lot of IO.

    Yes, I forgot to mention that my clustsred index is on an ID column that's an INT but as you say, the leaf pages of a clustsred index and the data pages of the table the index is on is one of the same so if it's a wide table it will need more pages than a non-clustsred index on a BIGINT column as in my example.

    Apologies to the OP for not clarifying this eariler.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply