Query Performs MUCH better when parameter in a variable

  • declare @xdate varchar(8)

    select @xdate = '20071130'

    SELECT

    a.Field1, a.Field2, a.Field3, max(c.Field1)

    FROM

    TableA a

    INNER JOIN TableB b ON a.FieldX = b.FieldX

    LEFT JOIN TableC c on a.FieldY = c.FieldY AND a.FieldZ = c.FieldZ

    WHERE

    a.Verified IN ('X', 'F', 'N', '')

    AND a.XDate = '20071130'

    --AND a.XDate = @xdate

    GROUP BY

    a.Field1, A.Field2, A.Field3

    >Table =10M rows, TableB=10M rows, TableC=19M rows

    >There is a 1 to 1 relationship between TableA and Table B - FieldX is the PK in both tables

    >TableC has a compound index on FieldA, FieldB - it is not the PK in the table

    >There is an index in place on TableA.XDate - approx 20,000 records per day out of 10M total

    When the query is run with a hardcoded date, the query plan indicates that the XDate index is used on TableA to limit the rows selected. The optimizer then joins properly to TableB on the PKs - but 77M rows are pulled from TableC (which is 4x the number of actual rows in the table - one row for each item in the IN statement???) The 500 summary records return in about 8 minutes.

    When the hardcoded date is removed and the date is passed in through a variable, the query uses the indexes on all 3 tables properly and it returns the same 500 summary records in 2-3 seconds.

    I'm interested in thoughts as to why the query runs in 3 seconds when a variable is used - and 8 minutes when the date is hardcoded. I have found that several queries in the system are suffering the same strange behavior.

    Please don't critique the design - it has been in place for years and I'm not able to make any changes. The system has a rule-based front-end and many of the queries are dynamic SQL generated by the rules engine.

    Thanks,

    Harley

  • Is is the same for all dates? Meaning hard coding different dates.

    It's possible that there's a different level of selectivity and a different plan is used.

  • Steve,

    Dates from mid-Nov back all behave the same and seem to operate properly when run either way - the more recent dates are subject to the odd pattern - the strange thing though is that the same date, say '20071130', behaves badly when hardcoded and performs excellent when passed in a variable....

    By the way, I enjoy your columns.

    Thanks for the response,

    Harley

  • Could it be because of the execution plan re-use? I.e., you execute the query first for mid-november and the plan is cached. Then when you run it for the recent dates as parameters SQL is re-using the plan but if you run for hard-coded dates it generates a new non-optimal plan every time.

  • that is very strange. Can you look at the plans for one good and one bad execution and see a difference?

    Or post them?

  • Steve,

    The plans are very different - index seek on xDate in both - then loop join to TableB - then

    > the poor plan scans TableC and shows 77M records retrieved

    >the other plan shows a loop join to TableC and 500 records retrieved.......

    It may have been bad stats - we have been updating stats nightly with default sampling - updating the stats with 100% sampling seems to have brought the plan back to normal with identical performance with either a hardcoded date or a date passed in a variable.

    Thanks,

    Harley

  • Thanks for the update. You might have some weird selectivity there with the data.

    Good to know it's working.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply