Execution Plan Help. Enormous Increase in row count in plan

Question

Post reply

Execution Plan Help. Enormous Increase in row count in plan

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply

thomashohner SSCrazy Eights Points: 9856 More actions · Answer 1

thomashohner

SSCrazy Eights

Points: 9856

September 30, 2016 at 11:33 am

#1904229

It would have to be combination of PatientProfileID and Databaskey in the Source system its PatientProfileID as the PK but we are currently loading multiple source systems into one tables so to prevent duplication of keys it would have to be both columns.

***SQL born on date Spring 2013:-)

thomashohner SSCrazy Eights Points: 9856 More actions · Answer 2

Hi Scott,

I tried your code and I get this error with the MAX()

Msg 8124, Level 16, State 1, Line 16

Multiple columns are specified in an aggregated expression containing an outer reference. If an expression being aggregated contains an outer reference, then that outer reference must be the only column referenced in the expression.

However if i remove the max it does run the query very fast 30 seconds for 1.6 million rows on the output the Problem is its should be a row count of a 1,060,000.

You have however sparked my interest in the way you are using the outer apply. That seems pretty darn cool.

***SQL born on date Spring 2013:-)

ScottPletcher SSC Guru Points: 100262 More actions · Answer 3

Try this:



...

          Ethnicity,

          MaritalStatus,

          Preferredlanguage = COALESCE(LAll.Preferredlanguage, Languagelist.Description, 'Unknown'),

          Employmentstatus,

...

	  OUTER APPLY (

          SELECT Ethnicity = ISNULL(MAX(CASE WHEN Medlistsid = Ethnicitymid

	             THEN CASE

                     WHEN Description LIKE 'Hisp%' THEN 'Hispanic or Latino'

                     WHEN Description LIKE 'Not%' THEN 'Non-Hispanic or Latino'

                     WHEN Description LIKE '%Declined%' THEN 'Declined'

                     END END), 'Unknown'),

                 Maritalstatus = ISNULL(MAX(CASE WHEN Medlistsid = Maritalstatusmid

                     THEN Description

                     END), 'Unknown'),

                 Preferredlanguage = MAX(CASE WHEN Medlistsid = Empstatusmid

                     THEN Shortdescription

                     END),

                 Employmentstatus = ISNULL(MAX(CASE WHEN Medlistsid = Preflanguagemid

                     THEN Description

                     END), 'N/A')

          FROM (

              SELECT L.Medlistid, L.Description, L.Shortdescription,

                  Pp.Ethnicitymid, Pp.Maritalstatusmid, Pp.Empstatusmid, Pp.Preflanguagemid

	          FROM Lists L

	          WHERE L.Databasekey = Pp.Databasekey AND

	              L.Medlistsid IN ( Pp.Ethnicitymid, Pp.Maritalstatusmid, Pp.Empstatusmid, Pp.Preflanguagemid )

          ) AS derived

	  ) AS Lall

...

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

thomashohner SSCrazy Eights Points: 9856 More actions · Answer 4

thomashohner

SSCrazy Eights

Points: 9856

September 30, 2016 at 12:12 pm

#1904237

WOW, you have to be joking. 14 seconds row count on the money. Attached is the execution plan.

Thank you both so very much. Holy molly :-D:-D

***SQL born on date Spring 2013:-)

drew.allen SSC Guru Points: 76776 More actions · Answer 5

thomashohner (9/30/2016)

It would have to be combination of PatientProfileID and Databaskey in the Source system its PatientProfileID as the PK but we are currently loading multiple source systems into one tables so to prevent duplication of keys it would have to be both columns.

I think that this is part of your problem, particularly for something like language. English from database A is going to be exactly the same as English from database B. You should be merging those records together and map the ids from their respective databases to the new id in the data warehouse for all records that reference that record.

There are only about 6500 languages in the world, and only about 3000 with a significant number of speakers. There is no reason for your language table to have 9000 records.

I suspect that the same is true for your other lookup tables.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

drew.allen SSC Guru Points: 76776 More actions · Answer 6

drew.allen

SSC Guru

Points: 76776

September 30, 2016 at 12:28 pm

#1904242

As a clarification. ETL stand for "Extract/Transform/Load". It sounds like you are skipping the Transform part of ETL.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

thomashohner SSCrazy Eights Points: 9856 More actions · Answer 7

thomashohner

SSCrazy Eights

Points: 9856

September 30, 2016 at 12:32 pm

#1904243

Yes, Unfortunately we are not using SSIS so this is the "Transformation" from source to dimension using SQl procs.

***SQL born on date Spring 2013:-)

ScottPletcher SSC Guru Points: 100262 More actions · Answer 8

thomashohner (9/30/2016)

WOW, you have to be joking. 14 seconds row count on the money. Attached is the execution plan.

Thank you both so very much. Holy molly :-D:-D

Great, glad it worked so well!

But don't forget the clustering issue I mentioned earlier. It looks as if you're still relying on a nonclus index on Lists rather than clustering the table by ( DatabaseKey, MedListsId ). If (almost) all of your lookups are by DatabaseKey -- and it seems they would be -- cluster that way and you can get rid of having to maintain an added nonclus index.

The same may be true for other tables, I can't tell by just the query here. But they should certainly be reviewed as well.

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

thomashohner SSCrazy Eights Points: 9856 More actions · Answer 9

The clustered index issue is hard as these are staging tables and the method we use from change tracking is prone to duplicate inserts sometimes. Its far from ideal. I will say that on the production system with the clustered index the original left join method was significantly faster than outer apply. For my environment and what i'm stuck with using it worked out very, very well.

Thanks again to you and Lynn for taking time out of your day to look at my issue.

***SQL born on date Spring 2013:-)

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 10

thomashohner (9/30/2016)

The clustered index issue is hard as these are staging tables and the method we use from change tracking is prone to duplicate inserts sometimes. Its far from ideal. I will say that on the production system with the clustered index the original left join method was significantly faster than outer apply. For my environment and what i'm stuck with using it worked out very, very well.

Thanks again to you and Lynn for taking time out of your day to look at my issue.

Sorry I didn't get anything to you, but I have to work it in around my paid work. Glad you were able to get something that works well.

thomashohner SSCrazy Eights Points: 9856 More actions · Answer 11

thomashohner

SSCrazy Eights

Points: 9856

September 30, 2016 at 2:01 pm

#1904257

Lynn you are always appreciated. You have helped me out so many times. You sir are very appreciated 😀

***SQL born on date Spring 2013:-)

ScottPletcher SSC Guru Points: 100262 More actions · Answer 12

thomashohner (9/30/2016)

The clustered index issue is hard as these are staging tables and the method we use from change tracking is prone to duplicate inserts sometimes. Its far from ideal. I will say that on the production system with the clustered index the original left join method was significantly faster than outer apply. For my environment and what i'm stuck with using it worked out very, very well.

Thanks again to you and Lynn for taking time out of your day to look at my issue.

Dups won't affect that. You can create a clus index without it being explicitly declared as unique:

CREATE CLUSTERED INDEX index_name ON dbo.table_name ( ... ) WITH ( FILLFACTOR = ..., ) ON [filegroup];

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

thomashohner SSCrazy Eights Points: 9856 More actions · Answer 13

thomashohner

SSCrazy Eights

Points: 9856

September 30, 2016 at 2:24 pm

#1904260

Well duh, i'll call that a senior moment even though I'm only 37 okay, going to give that a try as well.

***SQL born on date Spring 2013:-)