Query Performance

  • Hi,

    I have a problematic query and need your guidance.

    Let's that I have 3 tables A, B and C and all 3 of them have large amount of data(lets say 4 million each).

    In the query I have, A is joined to B with a on clause on a column, B to C on another column.

    The where clause just have a column from table C to meet the business demand and to restrict the number of results.

    Now the problem starts, the execution plan shows that in table A and B, clustered index scan is done. I can create non clustered indexes on these tables but will they work since their columns are not in where clause.

    Will it help me to create indexes for these tables based on 'ON' clause on which they are joined.

    Please let me know what tried and tested theory says. I will try creating index in off-business hours but need some experts opinion here.

    Regards

    Chandan

  • Can you post the query and actual execution plan?

    Also if your goal is to go from an Index scan to a seek , then this is not always possbile or even an improvment

  • No real way to tell you anything without more information. Minimum an execution plan.

    You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Grant Fritchey (10/5/2011)


    No real way to tell you anything without more information. Minimum an execution plan.

    You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?

    Sorry Grant and Steve. I was having dinner so could not reply for sometime.

    I am attaching the query and execution plan. The query is a one liner and reads from a view and thus I am attaching the defintion for view as well.Please let me know if I can give some more information.

    Regards

    Chandan

  • chandan_jha18 (10/5/2011)


    Grant Fritchey (10/5/2011)


    No real way to tell you anything without more information. Minimum an execution plan.

    You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?

    Sorry Grant and Steve. I was having dinner so could not reply for sometime.

    I am attaching the query and execution plan. The query is a one liner and reads from a view and thus I am attaching the defintion for view as well.Please let me know if I can give some more information.

    Regards

    Chandan

    Looking through it a few things stand out. Stuff like this:

    CAST(dbo.WH_INET_VendorCustomerRelation.VendorId AS VARCHAR(6))

    When used in a JOIN or a WHERE clause automatically cause scans. You can't get away with it otherwise. Your table structure joins on columns that are not the same data type? I'd start working there. Fix that. You can't tune this otherwise.

    If you look at the execution plan, you're getting bitmap filtering

    PROBE([Opt_Bitmap1025],[CardData].[dbo].[WH_CarrierData].[Id],N'[IN ROW]')

    It's a mechanism to enhance performance in parallel execution plans when dealing with large data sets. From what I can tell, your hitting so much data all at once that SQL Server is forced to use this approach.

    I'd start with gettng rid of that CAST and then see where the plan takes you. All the scans right now are killing you. You do have a bit of disparity on the statistics too, but that might not be an issue.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Your outer query has the WHERE clause:

    where PostedDate='1/19/2011'

    and PostedDate is defined in the view as:

    CAST(CAST(SUBSTRING(WH_Status.StatusHistory,CHARINDEX('Batch posted successfully',CONVERT(VARCHAR(MAX),WH_Status.StatusHistory))-35,26) AS DATE) AS DATETIME) AS PostedDate

    You will always get a scan on the WH_Status table unless you can filter on an indexed field.

  • paul_ramster (10/5/2011)


    Your outer query has the WHERE clause:

    where PostedDate='1/19/2011'

    and PostedDate is defined in the view as:

    CAST(CAST(SUBSTRING(WH_Status.StatusHistory,CHARINDEX('Batch posted successfully',CONVERT(VARCHAR(MAX),WH_Status.StatusHistory))-35,26) AS DATE) AS DATETIME) AS PostedDate

    You will always get a scan on the WH_Status table unless you can filter on an indexed field.

    Yeah, there are functions all over the place leading to scans. Data types are there for you to use. Not using them will cost you.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.

    But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?

    As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.

    @Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.

    Thanks

    Chandan

  • chandan_jha18 (10/5/2011)


    Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.

    But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?

    As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.

    @Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.

    Thanks

    Chandan

    Pull the date time values out during a data load, create a calculated column, move it ahead of time. Having to search the string like that, there is just literally no way, at all, to speed up the query. It will always do a scan. To change it, you have to change the structure in some way.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Grant Fritchey (10/5/2011)


    chandan_jha18 (10/5/2011)


    Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.

    But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?

    As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.

    @Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.

    Thanks

    Chandan

    Pull the date time values out during a data load, create a calculated column, move it ahead of time. Having to search the string like that, there is just literally no way, at all, to speed up the query. It will always do a scan. To change it, you have to change the structure in some way.

    What about a join condition like this:

    ON DAY(t.Tdate) = DAY(p.PRD_ValidDate)

    AND MONTH(t.Tdate) = MONTH(p.PRD_ValidDate)

    AND YEAR(t.Tdate) = YEAR(p.PRD_ValidDate)

    http://sqlvince.blogspot.com/[/url]

  • vince_sql (10/5/2011)


    Grant Fritchey (10/5/2011)


    chandan_jha18 (10/5/2011)


    Thanks. I was trying to remove the cast functions in the 'on' filter which Grant mentioned sometime ago.

    But as you(Paul and Grant) mentioned about posted date which is defined weirdly inside view, what can be done in such case to avoid a index scan?

    As you can see that posted date looks to et defined dynamically, what can be done to do it more efficiently.

    @Grant, as you can see that posted date is itself calculated by so many logics, how to use a good data type for this as you mentioned.

    Thanks

    Chandan

    Pull the date time values out during a data load, create a calculated column, move it ahead of time. Having to search the string like that, there is just literally no way, at all, to speed up the query. It will always do a scan. To change it, you have to change the structure in some way.

    What about a join condition like this:

    ON DAY(t.Tdate) = DAY(p.PRD_ValidDate)

    AND MONTH(t.Tdate) = MONTH(p.PRD_ValidDate)

    AND YEAR(t.Tdate) = YEAR(p.PRD_ValidDate)

    Nope. Calculations on columns result in scans. Period. SQL Server can't know what the values of those calculations are going to be, so it can't use an index.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • I changed the data type of 2 columns to match the data type of other column so that varchar can be compared with varchar but still it didn't make much difference in performance.

    Any other suggestions please.

    Regards

    Chandan

  • I've been watching. I've learned all I know from Grant and I haven't read more than 10% of his stuff (yet). So if he says there's nothing to do unless... then I'd listen 😉

  • Grant Fritchey (10/5/2011)


    chandan_jha18 (10/5/2011)


    Grant Fritchey (10/5/2011)


    No real way to tell you anything without more information. Minimum an execution plan.

    You say you're getting a table scan, do you have clustered indexes on the tables? If not, why not?

    Sorry Grant and Steve. I was having dinner so could not reply for sometime.

    I am attaching the query and execution plan. The query is a one liner and reads from a view and thus I am attaching the defintion for view as well.Please let me know if I can give some more information.

    Regards

    Chandan

    Looking through it a few things stand out. Stuff like this:

    CAST(dbo.WH_INET_VendorCustomerRelation.VendorId AS VARCHAR(6))

    When used in a JOIN or a WHERE clause automatically cause scans. You can't get away with it otherwise. Your table structure joins on columns that are not the same data type? I'd start working there. Fix that. You can't tune this otherwise.

    If you look at the execution plan, you're getting bitmap filtering

    PROBE([Opt_Bitmap1025],[CardData].[dbo].[WH_CarrierData].[Id],N'[IN ROW]')

    It's a mechanism to enhance performance in parallel execution plans when dealing with large data sets. From what I can tell, your hitting so much data all at once that SQL Server is forced to use this approach.

    I'd start with gettng rid of that CAST and then see where the plan takes you. All the scans right now are killing you. You do have a bit of disparity on the statistics too, but that might not be an issue.

    I changed the data type of vendor id and another column to avoid a cast and stored them as varchar but still index scan is forced.

    Any other suggestion?

    Thanks

    Chandan

  • Grant Fritchey (10/5/2011)


    paul_ramster (10/5/2011)


    Your outer query has the WHERE clause:

    where PostedDate='1/19/2011'

    and PostedDate is defined in the view as:

    CAST(CAST(SUBSTRING(WH_Status.StatusHistory,CHARINDEX('Batch posted successfully',CONVERT(VARCHAR(MAX),WH_Status.StatusHistory))-35,26) AS DATE) AS DATETIME) AS PostedDate

    You will always get a scan on the WH_Status table unless you can filter on an indexed field.

    Yeah, there are functions all over the place leading to scans. Data types are there for you to use. Not using them will cost you.

    Can you please explain me what advice you gave me here.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply