Execution Plan Help. Enormous Increase in row count in plan

  • It would have to be combination of PatientProfileID and Databaskey in the Source system its PatientProfileID as the PK but we are currently loading multiple source systems into one tables so to prevent duplication of keys it would have to be both columns.

    ***SQL born on date Spring 2013:-)

  • Hi Scott,

    I tried your code and I get this error with the MAX()

    Msg 8124, Level 16, State 1, Line 16

    Multiple columns are specified in an aggregated expression containing an outer reference. If an expression being aggregated contains an outer reference, then that outer reference must be the only column referenced in the expression.

    However if i remove the max it does run the query very fast 30 seconds for 1.6 million rows on the output the Problem is its should be a row count of a 1,060,000.

    You have however sparked my interest in the way you are using the outer apply. That seems pretty darn cool.

    ***SQL born on date Spring 2013:-)

  • Try this:

    ...

    Ethnicity,

    MaritalStatus,

    Preferredlanguage = COALESCE(LAll.Preferredlanguage, Languagelist.Description, 'Unknown'),

    Employmentstatus,

    ...

    OUTER APPLY (

    SELECT Ethnicity = ISNULL(MAX(CASE WHEN Medlistsid = Ethnicitymid

    THEN CASE

    WHEN Description LIKE 'Hisp%' THEN 'Hispanic or Latino'

    WHEN Description LIKE 'Not%' THEN 'Non-Hispanic or Latino'

    WHEN Description LIKE '%Declined%' THEN 'Declined'

    END END), 'Unknown'),

    Maritalstatus = ISNULL(MAX(CASE WHEN Medlistsid = Maritalstatusmid

    THEN Description

    END), 'Unknown'),

    Preferredlanguage = MAX(CASE WHEN Medlistsid = Empstatusmid

    THEN Shortdescription

    END),

    Employmentstatus = ISNULL(MAX(CASE WHEN Medlistsid = Preflanguagemid

    THEN Description

    END), 'N/A')

    FROM (

    SELECT L.Medlistid, L.Description, L.Shortdescription,

    Pp.Ethnicitymid, Pp.Maritalstatusmid, Pp.Empstatusmid, Pp.Preflanguagemid

    FROM Lists L

    WHERE L.Databasekey = Pp.Databasekey AND

    L.Medlistsid IN ( Pp.Ethnicitymid, Pp.Maritalstatusmid, Pp.Empstatusmid, Pp.Preflanguagemid )

    ) AS derived

    ) AS Lall

    ...

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • WOW, you have to be joking. 14 seconds row count on the money. Attached is the execution plan.

    Thank you both so very much. Holy molly :-D:-D

    ***SQL born on date Spring 2013:-)

  • thomashohner (9/30/2016)


    It would have to be combination of PatientProfileID and Databaskey in the Source system its PatientProfileID as the PK but we are currently loading multiple source systems into one tables so to prevent duplication of keys it would have to be both columns.

    I think that this is part of your problem, particularly for something like language. English from database A is going to be exactly the same as English from database B. You should be merging those records together and map the ids from their respective databases to the new id in the data warehouse for all records that reference that record.

    There are only about 6500 languages in the world, and only about 3000 with a significant number of speakers. There is no reason for your language table to have 9000 records.

    I suspect that the same is true for your other lookup tables.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • As a clarification. ETL stand for "Extract/Transform/Load". It sounds like you are skipping the Transform part of ETL.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Yes, Unfortunately we are not using SSIS so this is the "Transformation" from source to dimension using SQl procs.

    ***SQL born on date Spring 2013:-)

  • thomashohner (9/30/2016)


    WOW, you have to be joking. 14 seconds row count on the money. Attached is the execution plan.

    Thank you both so very much. Holy molly :-D:-D

    Great, glad it worked so well!

    But don't forget the clustering issue I mentioned earlier. It looks as if you're still relying on a nonclus index on Lists rather than clustering the table by ( DatabaseKey, MedListsId ). If (almost) all of your lookups are by DatabaseKey -- and it seems they would be -- cluster that way and you can get rid of having to maintain an added nonclus index.

    The same may be true for other tables, I can't tell by just the query here. But they should certainly be reviewed as well.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • The clustered index issue is hard as these are staging tables and the method we use from change tracking is prone to duplicate inserts sometimes. Its far from ideal. I will say that on the production system with the clustered index the original left join method was significantly faster than outer apply. For my environment and what i'm stuck with using it worked out very, very well.

    Thanks again to you and Lynn for taking time out of your day to look at my issue.

    ***SQL born on date Spring 2013:-)

  • thomashohner (9/30/2016)


    The clustered index issue is hard as these are staging tables and the method we use from change tracking is prone to duplicate inserts sometimes. Its far from ideal. I will say that on the production system with the clustered index the original left join method was significantly faster than outer apply. For my environment and what i'm stuck with using it worked out very, very well.

    Thanks again to you and Lynn for taking time out of your day to look at my issue.

    Sorry I didn't get anything to you, but I have to work it in around my paid work. Glad you were able to get something that works well.

  • Lynn you are always appreciated. You have helped me out so many times. You sir are very appreciated 😀

    ***SQL born on date Spring 2013:-)

  • thomashohner (9/30/2016)


    The clustered index issue is hard as these are staging tables and the method we use from change tracking is prone to duplicate inserts sometimes. Its far from ideal. I will say that on the production system with the clustered index the original left join method was significantly faster than outer apply. For my environment and what i'm stuck with using it worked out very, very well.

    Thanks again to you and Lynn for taking time out of your day to look at my issue.

    Dups won't affect that. You can create a clus index without it being explicitly declared as unique:

    CREATE CLUSTERED INDEX index_name ON dbo.table_name ( ... ) WITH ( FILLFACTOR = ..., ) ON [filegroup];

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Well duh, i'll call that a senior moment even though I'm only 37 okay, going to give that a try as well.

    ***SQL born on date Spring 2013:-)

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply