Mysterious performance issue when joining a view

  • (edit - updated exe plan and added new indices)

    Hi all,

    We have a non-performant view, usual story (worked fine in the past but with growth has ground to a crawl), but unusually, it is one which runs fine by itself:

    Select *

    From viewINR03bNewValues

    (no results)

    We will join it to a table further down. The table is large, but not monstrous;

    Select Chatelet, ModelCode

    From tblInvoiceNotRegisteredHeader

    (370,000 rows)

    In both cases, the result is returned in a couple of seconds.

    Now, the mysterious part is, when we do join the two, the statement will take several hours (at best) to run , mysteriously despite the code having 0 rows to return (am I wrong?):

    SELECT dbo.tblInvoiceNotRegisteredHeader.*

    FROM dbo.tblInvoiceNotRegisteredHeader

    INNER JOIN

    dbo.viewINR03bNewValues

    ON dbo.tblInvoiceNotRegisteredHeader.Chatelet=dbo.viewINR03bNewValues.Chatelet

    AND

    dbo.tblInvoiceNotRegisteredHeader.ModelCode = dbo.viewINR03bNewValues.ModelCode

    GO

    I added some indices on the advice of the tuning Wizard, also created stats.

    Execution plans (Only the new plan attached)

    The old exe plan contained several index scans from 10-20%, total of over 60%, plus a 16% Hash Join and some parallism. These indices had <5% fragmentation.

    The new plan still has 40% total index scans but a lot of new non-clustered seeks. However the promised reduction was only 27%, and runtime is unchanged (still in excess of 3 hours). I will drop these indices.

    Stats are updated nightly.

    This query is one of several levels of of nested views in a sproc.

    Any ideas how to troubleshoot this one?

    Cheers,

    Jake.

    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Materialised view+table....

    SELECT dbo.viewINR01.Chatelet, dbo.viewINR01.ModelCode, dbo.viewINR01.FirstInvoiceDate, dbo.viewINR01.GrossTurnover, dbo.tblCalendar.CalendarId,

    dbo.viewINR01.IdChannel, dbo.viewINR01.DirectCustomerCode, dbo.tblCalendar.YYYY, dbo.tblCalendar.MM

    FROM dbo.tblCalendar INNER JOIN

    dbo.viewINR01 ON dbo.tblCalendar.FromDate = dbo.viewINR01.FirstInvoiceDate AND

    dbo.tblCalendar.Todate = dbo.viewINR01.FirstInvoiceDate LEFT OUTER JOIN

    dbo.tblInvoiceNotRegisteredHeader ON dbo.viewINR01.DirectCustomerCode = dbo.tblInvoiceNotRegisteredHeader.CustomerCode AND

    dbo.viewINR01.GrossTurnover = dbo.tblInvoiceNotRegisteredHeader.GrossTurnover AND

    dbo.viewINR01.ModelCode = dbo.tblInvoiceNotRegisteredHeader.ModelCode AND

    dbo.viewINR01.Chatelet = dbo.tblInvoiceNotRegisteredHeader.Chatelet

    WHERE (dbo.tblInvoiceNotRegisteredHeader.Chatelet IS NULL) AND (dbo.tblCalendar.CalendarisationId = 6)

    And the table...

    CREATE TABLE [dbo].[tblInvoiceNotRegisteredHeader](

    [Chassis] [varchar](8) NOT NULL,

    [ModelCode] [varchar](3) NOT NULL,

    [CustomerCode] [int] NOT NULL,

    [GrossTurnover] [decimal](10, 2) NOT NULL,

    [EstimatedIncentive] [decimal](18, 0) NOT NULL,

    [idCalendar] [char](10) NOT NULL,

    [YYYY] [int] NOT NULL,

    [MM] [int] NOT NULL,

    [IdChannel] [int] NOT NULL,

    [Accrue] [bit] NOT NULL,

    [Registered] [bit] NOT NULL,

    [LogisticCheck] [datetime] NULL,

    [DateCreated] [datetime] NULL,

    [DateLastAmended] [datetime] NULL,

    [DateClosed] [datetime] NULL,

    [UserCreated] [int] NULL,

    [UserLastAmended] [int] NULL,

    CONSTRAINT [PK_tblInvoiceNotRegistered] PRIMARY KEY CLUSTERED

    (

    [Chassis] ASC,

    [ModelCode] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    ALTER TABLE [dbo].[tblInvoiceNotRegisteredHeader] ADD CONSTRAINT [DF_tblInvoiceNotRegistered_Accrue] DEFAULT (0) FOR [Accrue]

    GO

    ALTER TABLE [dbo].[tblInvoiceNotRegisteredHeader] ADD CONSTRAINT [DF_tblInvoiceNotRegisteredHeader_Registered] DEFAULT (0) FOR [Registered]

    GO

  • Are the join predicates indexed in the underlying tables?

    What does the execution plan show? Look for scans instead of seeks.

  • This looks like you have views joining to views which are joining to views. Nesting views creates some of the worse execution plans.

    In order to help we need a few things. Mostly we need to see an actual execution plan, the one you attached is the estimated plan. Also, table and index definitions would be a big help.

    Take a look at this article that explains what to post for help with performance issues.

    http://qa.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Quick thought, if the view is returning in 20-30ms, that looks like once per row in the table

    😎

  • Everyone,

    Have clicked 'Actual Execution Plan' and uploading, along with the newly created indices....

  • Implicit conversions in joins:

    [Incentives].[dbo].[tblVehicleTurnoverSummary].[ModelCode]=CONVERT_IMPLICIT(nvarchar(3),[Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[ModelCode],0) AND [Incentives].[dbo].[tblVehicleTurnoverSummary].[Chassis]=CONVERT_IMPLICIT(nvarchar(8),[Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[Chassis],0)

    ? Referenced twice.

    [Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17] has no clustered index and a RID lookup. Add an appropriate clustered index and a covering index.

    As Sean pointed out, joining views to views confuses the optimiser – and you too.

    Try to incorporate table aliases into your code:

    -- view rewritten as NOT EXISTS:

    SELECT

    v.Chatelet,

    v.ModelCode,

    v.FirstInvoiceDate,

    v.GrossTurnover,

    c.CalendarId,

    v.IdChannel,

    v.DirectCustomerCode,

    c.YYYY,

    c.MM

    FROM dbo.tblCalendar c

    INNER JOIN dbo.viewINR01 v

    ON c.FromDate = v.FirstInvoiceDate

    AND c.Todate = v.FirstInvoiceDate

    WHERE c.CalendarisationId = 6

    AND NOT EXISTS (

    SELECT 1

    FROM dbo.tblInvoiceNotRegisteredHeader i

    WHERE v.DirectCustomerCode = i.CustomerCode

    AND v.GrossTurnover = i.GrossTurnover

    AND v.ModelCode = i.ModelCode

    AND v.Chatelet = i.Chatelet)

    -- query:

    SELECT i.*

    FROM dbo.tblInvoiceNotRegisteredHeader i

    INNER JOIN dbo.viewINR03bNewValues v

    ON i.Chassis = v.Chassis

    AND i.ModelCode = v.ModelCode

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The good news is, you have full optimization, so the optimizer didn't time out leaving you with a junk plan. The bad news is, yikes.

    I suspect you may have some out of date or inaccurate statistics. Hard to say only looking at the estimated plan, but it's building a table spool to help support a nested lookup operation that's being called 44,000 times. That's not good. Chris already spotted the conversion. That's going to prevent index use, no matter how good the indexes are. Looks like you could INCLUDE SmmtPeriod in the index [Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17] (oh god, does that mean there are 17 indexes on the table?). Even if the loop join and table spool stay in place, that'll reduce overhead by getting rid of the lookup operation. The core of the problem is the scan on [Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[_dta_index_tblInvoiceNotRegisteredHeader_14_1934018021__K3_K4_K2_K1_5_6_7_8_9_10_11_12_13_14_15_16_17]. We need to get a filter on that data. The estimates are that this returns 13,000 rows, but it's pulling 38,000 from this index to filter down to 1/3 that size. I'd concentrate there.

    Oh, and as everyone else says, joining and nesting views is a very common code smell. Avoid it at all costs. I realize this means rewriting a JOIN or JOINs that you already wrote in the view, but SQL Server and T-SQL are not object oriented programming systems that facilitate code reuse.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Grant Fritchey (5/15/2014)


    The good news is, you have full optimization, so the optimizer didn't time out leaving you with a junk plan. The bad news is, yikes.

    I suspect you may have some out of date or inaccurate statistics. Hard to say only looking at the estimated plan, but it's building a table spool to help support a nested lookup operation that's being called 44,000 times. That's not good. Chris already spotted the conversion. That's going to prevent index use, no matter how good the indexes are. Looks like you could INCLUDE SmmtPeriod in the index [Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17] (oh god, does that mean there are 17 indexes on the table?). Even if the loop join and table spool stay in place, that'll reduce overhead by getting rid of the lookup operation. The core of the problem is the scan on [Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[_dta_index_tblInvoiceNotRegisteredHeader_14_1934018021__K3_K4_K2_K1_5_6_7_8_9_10_11_12_13_14_15_16_17]. We need to get a filter on that data. The estimates are that this returns 13,000 rows, but it's pulling 38,000 from this index to filter down to 1/3 that size. I'd concentrate there.

    Oh, and as everyone else says, joining and nesting views is a very common code smell. Avoid it at all costs. I realize this means rewriting a JOIN or JOINs that you already wrote in the view, but SQL Server and T-SQL are not object oriented programming systems that facilitate code reuse.

    Thanks for the replies so far, all.

    As it stands, the query takes 8hrs 30mins+ to run....zoinks!!

    Next steps will be to add the clustered index to tblRegHist, and the column SmmtPeriod to the index [Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17]. There could well be 17 indexes on there by now, I'll check...

    Grant, thanks for your (particularly eloquent!) reply. You should see that the exe plan has been upgraded to the actual one. I won't pretend I know what a 'Loop Join' or a 'Table spool' are, but that's googleable. Nor would I know where to start slicing down that scan on _dta_index_tblInvoiceNotRegisteredHeader_14_1934018021__K3_K4_K2_K1_5_6_7_8_9_10_11_12_13_14_15_16_17

  • 17 indexes is not a small amount. That alone will negatively impact inserts and updates. So you should make sure you need all those indexes.

    To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Grant Fritchey (5/16/2014)


    To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

    I'm in favor of first fixing the conversion.

    I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • SQLRNNR (5/16/2014)


    Grant Fritchey (5/16/2014)


    To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

    I'm in favor of first fixing the conversion.

    I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

    I agree. I think I might have typed that in a little haste. Fix the conversion first.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Grant Fritchey (5/17/2014)


    SQLRNNR (5/16/2014)


    Grant Fritchey (5/16/2014)


    To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

    I'm in favor of first fixing the conversion.

    I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

    I agree. I think I might have typed that in a little haste. Fix the conversion first.

    I assuming Chris's new query was meant to sidestep the implied conversion, however, it's THAT query that runs in 8+ hours.

  • Jake Shelton (5/19/2014)


    Grant Fritchey (5/17/2014)


    SQLRNNR (5/16/2014)


    Grant Fritchey (5/16/2014)


    To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

    I'm in favor of first fixing the conversion.

    I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

    I agree. I think I might have typed that in a little haste. Fix the conversion first.

    I assuming Chris's new query was meant to sidestep the implied conversion, however, it's THAT query that runs in 8+ hours.

    Nope. It was meant to show what your original query was supposed to do, in a more eye-friendly manner.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (5/19/2014)


    Jake Shelton (5/19/2014)


    Grant Fritchey (5/17/2014)


    SQLRNNR (5/16/2014)


    Grant Fritchey (5/16/2014)


    To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

    I'm in favor of first fixing the conversion.

    I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

    I agree. I think I might have typed that in a little haste. Fix the conversion first.

    I assuming Chris's new query was meant to sidestep the implied conversion, however, it's THAT query that runs in 8+ hours.

    Nope. It was meant to show what your original query was supposed to do, in a more eye-friendly manner.

    Thanks for the clarification.

  • Investigate all of the tables referenced in this query and list those tables which contain columns [chassis] and [model], making a note of the datatype. You want to make the datatype the same for each column whichever table it’s in, but you also want to use the most appropriate datatype for the data i.e. don’t choose nvarchar over varchar if the data doesn’t require it.

    Next, address indexing of tblRegHist. A PK or at least a clustered index would normally be recommended. This query would definitely benefit from a covering index too, but since it’s a nonsense query you’re using for testing, make a note to remove it afterwards.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply