Multiple tables select performance - is 90 seconds normal?

  • Michal Mokros (12/6/2007)


    And what about replacing IN by CTE or temporary table?

    Generally the query engine on short lists like this can do constant scans faster than temp table. In addition, if there is a large list the engine will sometimes create a temp table itself which you can see if it does in the execution plan. However, I am not sure of the exact threshold at which the engine may decide to do so.

  • Antares686 (12/5/2007)


    Looking at it things look good as all the index lookups are seeks. However, I would still review and try all of my recommendations

    1) Try my altered query and see if the execution plan changes, I would hope to see the join between [/BIC/DZFIGL_C02T] [DT] AND mdp.[/BI0/SFISCYEAR] occur first.

    2) Drop the index from

    mdp.[/BIC/FZFIGL_C02] Clustered index on KEY_ZFIGL_C02P -- Keep this one for sure

    nonclustered nonunique index on KEY_ZFIGL_C02T -- Keep this one

    nonclustered nonunique index on KEY_ZFIGL_C02U <-

    nonclustered nonunique index on KEY_ZFIGL_C021 | Depending on if there are other queries

    nonclustered nonunique index on KEY_ZFIGL_C022 | run that would use these I would

    nonclustered nonunique index on KEY_ZFIGL_C023 | probably drop these

    nonclustered nonunique index on KEY_ZFIGL_C024 <-

    nonclustered nonunique index on all the above columns --- Drop this index

    3) Is FISCYEAR in mdp.[/BI0/SFISCYEAR] is a VARCHAR column do

    [S2].[FISCYEAR] = '2007'

    but if it is numeric do like so

    [S2].[FISCYEAR] = 2007

    4) You start in your join with 18 Mill rows then reduces to 1.5 million and then down further. The trick I would like to see is to get something less than 18 Mill in the first join step, doing so might drop the time it takes.

    Hi Antares thanks for that.

    I tried running your modified version of the code, but the query did not run any faster unfortunately.

    I also tried deleting the index on BIC/FZFIGL_C02 that was linked to all columns. But this did not speed up the query either. And the query plan showed that it started to use the first index in the list not the last one, at 11% of the total utiliztion.

    So I think I will have to leave this query as it is. Thanks to all of you for your assistance with this issue - I appreciate it.

    Best Regards,

    Neil

    http://90.212.51.111 domain

  • Hi neil,

    could you please try to move the joins to where clauses as follows, I would be interested to see what the optimizer does with that:

    SELECT [D1].[SID_0COMPANY] AS [SID_0COMPANY]

    FROM mdp.[/BIC/FZFIGL_C02] [F]

    ,mdp.[/BIC/DZFIGL_C02T] [DT]

    ,mdp.[/BI0/SFISCYEAR] [S2]

    ,mdp.[/BIC/DZFIGL_C022] [D2]

    ,mdp.[/BIC/DZFIGL_C023] [D3]

    ,mdp.[/BIC/DZFIGL_C02U] [DU]

    ,mdp.[/BIC/DZFIGL_C024] [D4]

    ,mdp.[/BIC/DZFIGL_C02P] [DP]

    ,mdp.[/BIC/DZFIGL_C021] [D1]

    WHERE [DP].[SID_0CHNGID] = 0

    AND [D1].[SID_0COMPANY] IN ( 21 , 66 , 34 , 36 , 37 , 35 , 33 , 38 , 39 , 112 , 121 )

    AND [DT].[SID_0FISCPER3] = 12

    AND [S2].[FISCYEAR] = N'2007'

    AND [DP].[SID_0RECORDTP] = 0

    AND [DP].[SID_0REQUID] <= 2000000250

    AND [F].[KEY_ZFIGL_C02T] = [DT].[DIMID]

    AND [DT].[SID_0FISCYEAR] = [S2].[SID]

    AND [F].[KEY_ZFIGL_C022] = [D2].[DIMID]

    AND [F].[KEY_ZFIGL_C023] = [D3].[DIMID]

    AND [F].[KEY_ZFIGL_C02U] = [DU].[DIMID]

    AND [F].[KEY_ZFIGL_C024] = [D4].[DIMID]

    AND [F].[KEY_ZFIGL_C02P] = [DP].[DIMID]

    AND [F].[KEY_ZFIGL_C021] = [D1].[DIMID]

    Best Regards,

    Chris Büttner

  • neil (12/6/2007)


    Hi Antares thanks for that.

    I tried running your modified version of the code, but the query did not run any faster unfortunately.

    I also tried deleting the index on BIC/FZFIGL_C02 that was linked to all columns. But this did not speed up the query either. And the query plan showed that it started to use the first index in the list not the last one, at 11% of the total utiliztion.

    So I think I will have to leave this query as it is. Thanks to all of you for your assistance with this issue - I appreciate it.

    Best Regards,

    Neil

    Sometimes that is the way it goes. There are possibly other things but without being onsite able to test and watch the results I can only give you the common ideas. If anything hits me I will let you know.

  • I don't know if we can breach the 90 second barrier you have, but using temp tables (expecially indexed ones) to pre process some of this should pretty much guarantee it never run for 22 minutes...

    Is there some easy way to - up front - find the key values for the 120 recods you're planning on returning? and build the rest of the data from there? Again - not knowing enough - I can only propose, but I can't help but keep thinking "too many moving parts in one query".

    Of course - at that point you're WAY past views....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Christian Buettner (12/6/2007)


    Hi neil,

    could you please try to move the joins to where clauses as follows, I would be interested to see what the optimizer does with that:

    SELECT [D1].[SID_0COMPANY] AS [SID_0COMPANY]

    FROM mdp.[/BIC/FZFIGL_C02] [F]

    ,mdp.[/BIC/DZFIGL_C02T] [DT]

    ,mdp.[/BI0/SFISCYEAR] [S2]

    ,mdp.[/BIC/DZFIGL_C022] [D2]

    ,mdp.[/BIC/DZFIGL_C023] [D3]

    ,mdp.[/BIC/DZFIGL_C02U] [DU]

    ,mdp.[/BIC/DZFIGL_C024] [D4]

    ,mdp.[/BIC/DZFIGL_C02P] [DP]

    ,mdp.[/BIC/DZFIGL_C021] [D1]

    WHERE [DP].[SID_0CHNGID] = 0

    AND [D1].[SID_0COMPANY] IN ( 21 , 66 , 34 , 36 , 37 , 35 , 33 , 38 , 39 , 112 , 121 )

    AND [DT].[SID_0FISCPER3] = 12

    AND [S2].[FISCYEAR] = N'2007'

    AND [DP].[SID_0RECORDTP] = 0

    AND [DP].[SID_0REQUID] <= 2000000250

    AND [F].[KEY_ZFIGL_C02T] = [DT].[DIMID]

    AND [DT].[SID_0FISCYEAR] = [S2].[SID]

    AND [F].[KEY_ZFIGL_C022] = [D2].[DIMID]

    AND [F].[KEY_ZFIGL_C023] = [D3].[DIMID]

    AND [F].[KEY_ZFIGL_C02U] = [DU].[DIMID]

    AND [F].[KEY_ZFIGL_C024] = [D4].[DIMID]

    AND [F].[KEY_ZFIGL_C02P] = [DP].[DIMID]

    AND [F].[KEY_ZFIGL_C021] = [D1].[DIMID]

    Hi Christian,

    I tried running your query, but no improvement unfortunately. It took 1 minute 25 seconds to run.

    I posted the actual execution plan to

    http://www.korma.co.uk/exeplan_071207.sqlplan

    (need to download this as a file)

    http://90.212.51.111 domain

  • Hm,

    And what happens with the following query?

    I hope it contains no error.

    SELECT D1.[SID_0COMPANY] AS [SID_0COMPANY]

    FROM mdp.[/BIC/DZFIGL_C021] D1

    WHERE EXISTS ( SELECT *

    FROM mdp.[/BIC/FZFIGL_C02] [F]

    JOIN mdp.[/BIC/DZFIGL_C02T] [DT] ON [F].[KEY_ZFIGL_C02T]=[DT].[DIMID]

    JOIN mdp.[/BI0/SFISCYEAR] [S2] ON [DT].[SID_0FISCYEAR]=[S2].[SID]

    JOIN mdp.[/BIC/DZFIGL_C022] [D2] ON [F].[KEY_ZFIGL_C022]=[D2].[DIMID]

    JOIN mdp.[/BIC/DZFIGL_C023] [D3] ON [F].[KEY_ZFIGL_C023]=[D3].[DIMID]

    JOIN mdp.[/BIC/DZFIGL_C02U] [DU] ON [F].[KEY_ZFIGL_C02U]=[DU].[DIMID]

    JOIN mdp.[/BIC/DZFIGL_C024] [D4] ON [F].[KEY_ZFIGL_C024]=[D4].[DIMID]

    JOIN mdp.[/BIC/DZFIGL_C02P] [DP] ON [F].[KEY_ZFIGL_C02P]=[DP].[DIMID]

    where [DP].[SID_0CHNGID]=0

    AND [DT].[SID_0FISCPER3]=12

    AND [S2].[FISCYEAR]=N'2007'

    AND [DP].[SID_0RECORDTP]=0

    AND [DP].[SID_0REQUID]<=2000000250

    AND [F].[KEY_ZFIGL_C021] = [D1].[DIMID])

    AND D1.[SID_0COMPANY] IN (21, 66, 34, 36, 37, 35, 33, 38, 39, 112, 121)

    Best Regards,

    Chris Büttner

  • Go to thinking about this and came up with an idea. The reason why it starts the path thru mdp.[/BIC/DZFIGL_C021] is because it has a fixed list to start with and the engine decided that gave the best path to results and because the final resulting value comes from that table.

    However, if you don't need every listing of mdp.[/BIC/DZFIGL_C021].[SID_0COMPANY] or can live with a count of occurrences see how this performs.

    SELECT

    [D1].[SID_0COMPANY] AS [SID_0COMPANY]

    ,COUNT([D1].[SID_0COMPANY]) AS COUNT_SID_0COMPANY

    FROM

    mdp.[/BIC/FZFIGL_C02] [F]

    INNER JOIN

    mdp.[/BIC/DZFIGL_C02T] [DT]

    INNER JOIN

    mdp.[/BI0/SFISCYEAR] [S2]

    ON

    [DT].[SID_0FISCYEAR] = [S2].[SID]

    AND [S2].[FISCYEAR] = N'2007'

    AND [DT].[SID_0FISCPER3] = 12

    ON

    [F].[KEY_ZFIGL_C02T] = [DT].[DIMID]

    INNER JOIN

    mdp.[/BIC/DZFIGL_C022] [D2]

    ON

    [F].[KEY_ZFIGL_C022] = [D2].[DIMID]

    INNER JOIN

    mdp.[/BIC/DZFIGL_C023] [D3]

    ON

    [F].[KEY_ZFIGL_C023] = [D3].[DIMID]

    INNER JOIN

    mdp.[/BIC/DZFIGL_C02U] [DU]

    ON

    [F].[KEY_ZFIGL_C02U] = [DU].[DIMID]

    INNER JOIN

    mdp.[/BIC/DZFIGL_C024] [D4]

    ON

    [F].[KEY_ZFIGL_C024] = [D4].[DIMID]

    INNER JOIN

    mdp.[/BIC/DZFIGL_C02P] [DP]

    ON

    [F].[KEY_ZFIGL_C02P] = [DP].[DIMID]

    AND [DP].[SID_0CHNGID] = 0

    AND [DP].[SID_0RECORDTP] = 0

    AND [DP].[SID_0REQUID] <= 2000000250

    INNER JOIN

    mdp.[/BIC/DZFIGL_C021] [D1]

    ON

    [F].[KEY_ZFIGL_C021] = [D1].[DIMID]

    GROUP BY

    [D1].[SID_0COMPANY]

    HAVING

    [D1].[SID_0COMPANY] IN ( 21 , 66 , 34 , 36 , 37 , 35 , 33 , 38 , 39 , 112 , 121 )

  • Antares686 (12/10/2007)


    Go to thinking about this and came up with an idea. The reason why it starts the path thru mdp.[/BIC/DZFIGL_C021] is because it has a fixed list to start with and the engine decided that gave the best path to results and because the final resulting value comes from that table.

    However, if you don't need every listing of mdp.[/BIC/DZFIGL_C021].[SID_0COMPANY] or can live with a count of occurrences see how this performs.

    I tried this, but no change - still 1 minute 20 seconds to run.

    However I made progress by simplifying the query a little:

    The join on table mdp.[/BIC/DZFIGL_C02P] which uses code DP, has three AND clauses in the query:

    [DP].[SID_0CHNGID] = 0

    [DP].[SID_0RECORDTP] = 0

    [DP].[SID_0REQUID] <= 2000000250

    I discovered that these conditions are TRUE for all 27 rows in the table.

    So I ran the query again with the table and these WHERE clauses removed and it completed in around 2 seconds!

    This is not a complete fix to the problem because I now need to find out whether the SAP OLAP application can be reconfigured to not use this table (it is called a 'dimension' table in OLAP parlence, with the mdp.[/BIC/FZFIGL_C02] F table being called the 'fact' table ,with 20 million rows).

    But the question remains: Why does this additional join cause the query to go from taking 2 seconds to run, to 90 seconds to run? I am not an expert on SQL relational theory, but my best guess is that it doesn't know that the results of the WHERE clause is ALL the rows in the table, and it won't know that the result of this being joined to the large F table will be ALL the rows of the F table. It therefore has to merge this result into the 'quick' result from the other tables, and this is what takes the time. And because it is trying to merge two calculated results sets indexes are of no help so it is effectively doing a table scan.

    It seems to me that this is a limitation in terms of what SQL Server can do. But I'm just thinking aloud here because I haven't studied the theory behind all this.

    Thanks to all for your help.

    http://90.212.51.111 domain

  • Just a thought and I don't know why I didn't bring this up before but when was the last time you defragged the indexes, updated the statistics (sp_updatestats in 2K, no BOL for 2k5 to check here), and update usage (DBCC UPDATEUSAGE in 2k). Sometimes it is the maintenance that can be the root issue as index stats and fragmentation can cause the query engine to make bad execution plan decisions.

  • Antares686 (12/10/2007)


    Just a thought and I don't know why I didn't bring this up before but when was the last time you defragged the indexes, updated the statistics (sp_updatestats in 2K, no BOL for 2k5 to check here), and update usage (DBCC UPDATEUSAGE in 2k). Sometimes it is the maintenance that can be the root issue as index stats and fragmentation can cause the query engine to make bad execution plan decisions.

    Eurika!

    I ran:

    ALTER INDEX ALL

    ON mdp.[/BIC/FZFIGL_C02]

    REBUILD WITH (ONLINE = ON)

    This ran and ran, and after 5 minutes with it still running I ran the 90-second search query again, and to my surprise it completed in 4 seconds.

    Yet the ALTER INDEX statement contined to run, and I stopped it after 12 minutes. The search query continued to take 4 seconds. Also the client's SAP report also took 4 seconds, and they were happy.

    This is strange because I had run dbcc DBREINDEX on it, and that had not fixed it.

    The query execution plan looked quite different, so it does look like it was a query plan issue, with fragmentation as a possible second issue.

    Thanks again.

    http://90.212.51.111 domain

  • No one has yet (that I saw) asked about the distribution of the data. Indexes are useless (unless covering) if the query optimizer estimates more than roughly high single digits for number of rows returned. Hitting 9 tables with the amounts of data you mention can be very slow - even if everything is optimal - especially if the hardware isn't up to snuff.

    To the OP: You have spent at least 8 days on this now (posted first on 12/4 - probably worked on it for some time prior to that). I guarantee you I could have determined the cause(s) of your performance and developed solution(s) if available in < 8 hours - probably < 2 - if I was connected to your database. Do yourself and your company a favor - hire a pro (there are numerous ones on this and the performance forum) to address this one. Company will benefit in a faster query and you will benefit from the mentoring and knowledge transfer and be able to more quickly/efficiently resolve future performance issues. "Hunting and pecking" something this complex on a forum isn't a good way to address the issue. My $0.02.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/12/2007)


    To the OP: You have spent at least 8 days on this now (posted first on 12/4 - probably worked on it for some time prior to that). I guarantee you I could have determined the cause(s) of your performance and developed solution(s) if available in < 8 hours - probably < 2 - if I was connected to your database. Do yourself and your company a favor - hire a pro (there are numerous ones on this and the performance forum) to address this one. Company will benefit in a faster query and you will benefit from the mentoring and knowledge transfer and be able to more quickly/efficiently resolve future performance issues. "Hunting and pecking" something this complex on a forum isn't a good way to address the issue. My $0.02.

    Yes I think you are right. Of course I thought a 9-table join would be simple for you guys 🙂

    But I did also get some telephone support from another SQL Server professional who specialises as a DBA, and his help was useful just as yours was.

    The trouble is I have just come out from a permanent job where I worked with SQL Server around 10% of the time, for 10 years. So I am a little rusty on SQL Server, but I have alot of experience with it. But SQL Server is the only skill that I have that is transferable, so I had to put that on my CV/resume as my main skill. So I got hired as an "expert" dba, and that is where the trouble started.

    But fortunately I am doing OK now, thanks to a little help 😉

    http://90.212.51.111 domain

  • Yes I think you are right. Of course I thought a 9-table join would be simple for you guys 🙂

    I don't think I have met a 9-table join yet that was 'simple'!! LOL

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Having analyzed your query, see what happens when you run it like this (sorry had to trim it down a bit for readability 🙂 )

    SELECT D1.SID_0COMPANY AS SID_0COMPANY

    FROM FZFIGL_C02 F JOIN DZFIGL_C02T DT ON F.KEY_ZFIGL_C02T = DT.DIMID AND DT.SID_0FISCPER3 = 12

    JOIN SFISCYEAR S2 ON DT.SID_0FISCYEAR = S2.SID AND S2.FISCYEAR = N'2007'

    JOIN DZFIGL_C022 D2 ON F.KEY_ZFIGL_C022 = D2.DIMID

    JOIN DZFIGL_C023 D3 ON F.KEY_ZFIGL_C023 = D3.DIMID

    JOIN DZFIGL_C02U DU ON F.KEY_ZFIGL_C02U = DU.DIMID

    JOIN DZFIGL_C024 D4 ON F.KEY_ZFIGL_C024 = D4.DIMID

    JOIN DZFIGL_C02P DP ON F.KEY_ZFIGL_C02P = DP.DIMID AND DP.SID_0CHNGID = 0 AND DP.SID_0RECORDTP = 0 AND DP.SID_0REQUID <= 2000000250

    JOIN DZFIGL_C021 D1 ON F.KEY_ZFIGL_C021 = D1.DIMID AND D1.SID_0COMPANY IN ( 21 , 66 , 34 , 36 , 37 , 35 , 33 , 38 , 39 , 112 , 121 )

    2. The optimizer would have to only look at these fields for indexing...

    SFISCYEAR.SID

    DZFIGL_C02T.DIMID

    DZFIGL_C02T.SID_0FISCYEAR

    FZFIGL_C02.KEY_ZFIGL_C02T

    FZFIGL_C02.KEY_ZFIGL_C022

    FZFIGL_C02.KEY_ZFIGL_C023

    FZFIGL_C02.KEY_ZFIGL_C02U

    FZFIGL_C02.KEY_ZFIGL_C024

    FZFIGL_C02.KEY_ZFIGL_C02P

    FZFIGL_C02.KEY_ZFIGL_C021

    DZFIGL_C021.SID_0COMPANY

    DZFIGL_C021.SID_0FISCPER3

    SFISCYEAR.FISCYEAR

    DZFIGL_C02P.SID_0RECORDTP

    DZFIGL_C02P.SID_0REQUID

    so, I would recomment unbundling some of your indexes from compound to simple or single field indexes.

    keen to know what happens?

    Regards,

    Tbs.

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply