Mysterious performance issue when joining a view

Question

Post reply

Mysterious performance issue when joining a view

Jake Shelton

Hall of Fame

Points: 3109
More actions
May 15, 2014 at 9:02 am

#307950

(edit - updated exe plan and added new indices)

Hi all,

We have a non-performant view, usual story (worked fine in the past but with growth has ground to a crawl), but unusually, it is one which runs fine by itself:

Select *

From viewINR03bNewValues

(no results)

We will join it to a table further down. The table is large, but not monstrous;

Select Chatelet, ModelCode

From tblInvoiceNotRegisteredHeader

(370,000 rows)

In both cases, the result is returned in a couple of seconds.

Now, the mysterious part is, when we do join the two, the statement will take several hours (at best) to run , mysteriously despite the code having 0 rows to return (am I wrong?):

SELECT dbo.tblInvoiceNotRegisteredHeader.*

FROM dbo.tblInvoiceNotRegisteredHeader

INNER JOIN

dbo.viewINR03bNewValues

ON dbo.tblInvoiceNotRegisteredHeader.Chatelet=dbo.viewINR03bNewValues.Chatelet

AND

dbo.tblInvoiceNotRegisteredHeader.ModelCode = dbo.viewINR03bNewValues.ModelCode

GO

I added some indices on the advice of the tuning Wizard, also created stats.

Execution plans (Only the new plan attached)

The old exe plan contained several index scans from 10-20%, total of over 60%, plus a 16% Hash Join and some parallism. These indices had <5% fragmentation.

The new plan still has 40% total index scans but a lot of new non-clustered seeks. However the promised reduction was only 27%, and runtime is unchanged (still in excess of 3 hours). I will drop these indices.

Stats are updated nightly.

This query is one of several levels of of nested views in a sproc.

Any ideas how to troubleshoot this one?

Cheers,

Jake.

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Materialised view+table....

SELECT dbo.viewINR01.Chatelet, dbo.viewINR01.ModelCode, dbo.viewINR01.FirstInvoiceDate, dbo.viewINR01.GrossTurnover, dbo.tblCalendar.CalendarId,

dbo.viewINR01.IdChannel, dbo.viewINR01.DirectCustomerCode, dbo.tblCalendar.YYYY, dbo.tblCalendar.MM

FROM dbo.tblCalendar INNER JOIN

dbo.viewINR01 ON dbo.tblCalendar.FromDate = dbo.viewINR01.FirstInvoiceDate AND

dbo.tblCalendar.Todate = dbo.viewINR01.FirstInvoiceDate LEFT OUTER JOIN

dbo.tblInvoiceNotRegisteredHeader ON dbo.viewINR01.DirectCustomerCode = dbo.tblInvoiceNotRegisteredHeader.CustomerCode AND

dbo.viewINR01.GrossTurnover = dbo.tblInvoiceNotRegisteredHeader.GrossTurnover AND

dbo.viewINR01.ModelCode = dbo.tblInvoiceNotRegisteredHeader.ModelCode AND

dbo.viewINR01.Chatelet = dbo.tblInvoiceNotRegisteredHeader.Chatelet

WHERE (dbo.tblInvoiceNotRegisteredHeader.Chatelet IS NULL) AND (dbo.tblCalendar.CalendarisationId = 6)

And the table...

CREATE TABLE [dbo].[tblInvoiceNotRegisteredHeader](

[Chassis] [varchar](8) NOT NULL,

[ModelCode] [varchar](3) NOT NULL,

[CustomerCode] [int] NOT NULL,

[GrossTurnover] [decimal](10, 2) NOT NULL,

[EstimatedIncentive] [decimal](18, 0) NOT NULL,

[idCalendar] [char](10) NOT NULL,

[YYYY] [int] NOT NULL,

[MM] [int] NOT NULL,

[IdChannel] [int] NOT NULL,

[Accrue] [bit] NOT NULL,

[Registered] [bit] NOT NULL,

[LogisticCheck] [datetime] NULL,

[DateCreated] [datetime] NULL,

[DateLastAmended] [datetime] NULL,

[DateClosed] [datetime] NULL,

[UserCreated] [int] NULL,

[UserLastAmended] [int] NULL,

CONSTRAINT [PK_tblInvoiceNotRegistered] PRIMARY KEY CLUSTERED

(

[Chassis] ASC,

[ModelCode] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

) ON [PRIMARY]

GO

SET ANSI_PADDING OFF

GO

ALTER TABLE [dbo].[tblInvoiceNotRegisteredHeader] ADD CONSTRAINT [DF_tblInvoiceNotRegistered_Accrue] DEFAULT (0) FOR [Accrue]

GO

ALTER TABLE [dbo].[tblInvoiceNotRegisteredHeader] ADD CONSTRAINT [DF_tblInvoiceNotRegisteredHeader_Registered] DEFAULT (0) FOR [Registered]

GO

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply

gbritton1 SSCertifiable Points: 6522 More actions · Answer 1

gbritton1

SSCertifiable

Points: 6522

May 15, 2014 at 9:16 am

#1713896

Are the join predicates indexed in the underlying tables?

What does the execution plan show? Look for scans instead of seeks.

Sean Lange SSC Guru Points: 286562 More actions · Answer 2

This looks like you have views joining to views which are joining to views. Nesting views creates some of the worse execution plans.

In order to help we need a few things. Mostly we need to see an actual execution plan, the one you attached is the estimated plan. Also, table and index definitions would be a big help.

Take a look at this article that explains what to post for help with performance issues.

http://qa.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Eirikur Eiriksson SSC Guru Points: 182631 More actions · Answer 3

Eirikur Eiriksson

SSC Guru

Points: 182631

May 15, 2014 at 9:20 am

#1713899

Quick thought, if the view is returning in 20-30ms, that looks like once per row in the table

😎

Jake Shelton Hall of Fame Points: 3109 More actions · Answer 4

Jake Shelton

Hall of Fame

Points: 3109

May 15, 2014 at 9:39 am

#1713905

Everyone,

Have clicked 'Actual Execution Plan' and uploading, along with the newly created indices....

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

Implicit conversions in joins:

[Incentives].[dbo].[tblVehicleTurnoverSummary].[ModelCode]=CONVERT_IMPLICIT(nvarchar(3),[Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[ModelCode],0) AND [Incentives].[dbo].[tblVehicleTurnoverSummary].[Chassis]=CONVERT_IMPLICIT(nvarchar(8),[Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[Chassis],0)

? Referenced twice.

[Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17] has no clustered index and a RID lookup. Add an appropriate clustered index and a covering index.

As Sean pointed out, joining views to views confuses the optimiser – and you too.

Try to incorporate table aliases into your code:

-- view rewritten as NOT EXISTS:



SELECT

	v.Chatelet,

	v.ModelCode,

	v.FirstInvoiceDate,

	v.GrossTurnover,

	c.CalendarId,

	v.IdChannel,

	v.DirectCustomerCode,

	c.YYYY,

	c.MM

FROM dbo.tblCalendar c

INNER JOIN dbo.viewINR01 v

	ON c.FromDate = v.FirstInvoiceDate

	AND c.Todate = v.FirstInvoiceDate

WHERE c.CalendarisationId = 6

	AND NOT EXISTS (

		SELECT 1

		FROM dbo.tblInvoiceNotRegisteredHeader i

		WHERE v.DirectCustomerCode = i.CustomerCode

			AND v.GrossTurnover = i.GrossTurnover

			AND v.ModelCode = i.ModelCode

			AND v.Chatelet = i.Chatelet)

-- query:

SELECT i.*

FROM dbo.tblInvoiceNotRegisteredHeader i

INNER JOIN dbo.viewINR03bNewValues v

	ON i.Chassis = v.Chassis

	AND i.ModelCode = v.ModelCode

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Grant Fritchey SSC Guru Points: 397721 More actions · Answer 6

The good news is, you have full optimization, so the optimizer didn't time out leaving you with a junk plan. The bad news is, yikes.

I suspect you may have some out of date or inaccurate statistics. Hard to say only looking at the estimated plan, but it's building a table spool to help support a nested lookup operation that's being called 44,000 times. That's not good. Chris already spotted the conversion. That's going to prevent index use, no matter how good the indexes are. Looks like you could INCLUDE SmmtPeriod in the index [Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17] (oh god, does that mean there are 17 indexes on the table?). Even if the loop join and table spool stay in place, that'll reduce overhead by getting rid of the lookup operation. The core of the problem is the scan on [Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[_dta_index_tblInvoiceNotRegisteredHeader_14_1934018021__K3_K4_K2_K1_5_6_7_8_9_10_11_12_13_14_15_16_17]. We need to get a filter on that data. The estimates are that this returns 13,000 rows, but it's pulling 38,000 from this index to filter down to 1/3 that size. I'd concentrate there.

Oh, and as everyone else says, joining and nesting views is a very common code smell. Avoid it at all costs. I realize this means rewriting a JOIN or JOINs that you already wrote in the view, but SQL Server and T-SQL are not object oriented programming systems that facilitate code reuse.

----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

Jake Shelton Hall of Fame Points: 3109 More actions · Answer 7

Grant Fritchey (5/15/2014)

The good news is, you have full optimization, so the optimizer didn't time out leaving you with a junk plan. The bad news is, yikes.

I suspect you may have some out of date or inaccurate statistics. Hard to say only looking at the estimated plan, but it's building a table spool to help support a nested lookup operation that's being called 44,000 times. That's not good. Chris already spotted the conversion. That's going to prevent index use, no matter how good the indexes are. Looks like you could INCLUDE SmmtPeriod in the index [Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17] (oh god, does that mean there are 17 indexes on the table?). Even if the loop join and table spool stay in place, that'll reduce overhead by getting rid of the lookup operation. The core of the problem is the scan on [Incentives].[dbo].[tblInvoiceNotRegisteredHeader].[_dta_index_tblInvoiceNotRegisteredHeader_14_1934018021__K3_K4_K2_K1_5_6_7_8_9_10_11_12_13_14_15_16_17]. We need to get a filter on that data. The estimates are that this returns 13,000 rows, but it's pulling 38,000 from this index to filter down to 1/3 that size. I'd concentrate there.

Oh, and as everyone else says, joining and nesting views is a very common code smell. Avoid it at all costs. I realize this means rewriting a JOIN or JOINs that you already wrote in the view, but SQL Server and T-SQL are not object oriented programming systems that facilitate code reuse.

Thanks for the replies so far, all.

As it stands, the query takes 8hrs 30mins+ to run....zoinks!!

Next steps will be to add the clustered index to tblRegHist, and the column SmmtPeriod to the index [Sales].[dbo].[tblRegHist].[PK_tblRegHist_1__17]. There could well be 17 indexes on there by now, I'll check...

Grant, thanks for your (particularly eloquent!) reply. You should see that the exe plan has been upgraded to the actual one. I won't pretend I know what a 'Loop Join' or a 'Table spool' are, but that's googleable. Nor would I know where to start slicing down that scan on _dta_index_tblInvoiceNotRegisteredHeader_14_1934018021__K3_K4_K2_K1_5_6_7_8_9_10_11_12_13_14_15_16_17

Grant Fritchey SSC Guru Points: 397721 More actions · Answer 8

17 indexes is not a small amount. That alone will negatively impact inserts and updates. So you should make sure you need all those indexes.

To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

SQLRNNR SSC Guru Points: 281324 More actions · Answer 9

Grant Fritchey (5/16/2014)

To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

I'm in favor of first fixing the conversion.

I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Grant Fritchey SSC Guru Points: 397721 More actions · Answer 10

SQLRNNR (5/16/2014)

Grant Fritchey (5/16/2014)

To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

I'm in favor of first fixing the conversion.

I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

I agree. I think I might have typed that in a little haste. Fix the conversion first.

----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

Jake Shelton Hall of Fame Points: 3109 More actions · Answer 11

Grant Fritchey (5/17/2014)

SQLRNNR (5/16/2014)

Grant Fritchey (5/16/2014)

To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

I'm in favor of first fixing the conversion.

I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

I agree. I think I might have typed that in a little haste. Fix the conversion first.

I assuming Chris's new query was meant to sidestep the implied conversion, however, it's THAT query that runs in 8+ hours.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

Jake Shelton (5/19/2014)

Grant Fritchey (5/17/2014)

SQLRNNR (5/16/2014)

Grant Fritchey (5/16/2014)

To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

I'm in favor of first fixing the conversion.

I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

I agree. I think I might have typed that in a little haste. Fix the conversion first.

I assuming Chris's new query was meant to sidestep the implied conversion, however, it's THAT query that runs in 8+ hours.

Nope. It was meant to show what your original query was supposed to do, in a more eye-friendly manner.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Jake Shelton Hall of Fame Points: 3109 More actions · Answer 13

ChrisM@Work (5/19/2014)

Jake Shelton (5/19/2014)

Grant Fritchey (5/17/2014)

SQLRNNR (5/16/2014)

Grant Fritchey (5/16/2014)

To address the scan, you need to either have a filter of some kind to reduce the amount of data, or fix the conversion so that the optimizer can use the statistics to seek on the index instead of scanning.

I'm in favor of first fixing the conversion.

I have a demo that shows a 10,000 record result set take 10ms without implicit conversions that gets cranked up to 30 minutes when using an implicit converson - on SSD.

I agree. I think I might have typed that in a little haste. Fix the conversion first.

I assuming Chris's new query was meant to sidestep the implied conversion, however, it's THAT query that runs in 8+ hours.

Nope. It was meant to show what your original query was supposed to do, in a more eye-friendly manner.

Thanks for the clarification.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 14

Investigate all of the tables referenced in this query and list those tables which contain columns [chassis] and [model], making a note of the datatype. You want to make the datatype the same for each column whichever table it’s in, but you also want to use the most appropriate datatype for the data i.e. don’t choose nvarchar over varchar if the data doesn’t require it.

Next, address indexing of tblRegHist. A PK or at least a clustered index would normally be recommended. This query would definitely benefit from a covering index too, but since it’s a nonsense query you’re using for testing, make a note to remove it afterwards.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden