Fine tuning Queries

  • Can anyone look at the following queries and suggest a more efficient way of writing them. It would be a good practice for some of you experts.

    1) Retrieve loan records, passing in Start and End Date values:

    SELECT applicant.*, credit_application.* FROM credit_application

    INNER JOIN applicant ON credit_application.application_id = applicant.application_id

    WHERE credit_application.decision_date >= 'mm/dd/yyyy'

    AND credit_application.decision_date <= 'mm/dd/yyyy 23:59:59.992'

    AND credit_application.delete_app = 0

    AND (credit_application.decision_status = 'C' OR credit_application.decision_status = 'D')

    ORDER BY credit_application.decision_status

    Save Application ID, Product ID, Applicant.cust_id

    2)Retrieve source office and region based on application id:

    SELECT credit_application.src_office_id, source_office.name, region.region_name From credit_application

    INNER JOIN source_office On source_office.src_office_id = credit_application.src_office_id

    INNER JOIN region ON source_office.region_id = region.region_id

    WHERE credit_application.application_id = <application id>

    3)Retrieve credit bureau name information:

    SELECT cb_region.region_name FROM cb_region

    INNER JOIN cb_report On cb_report.cb_id = cb_region.cb_id

    INNER JOIN cb_rpt_ownership ON cb_rpt_ownership.cb_rpt_id = cb_report.cb_rpt_id

    WHERE cb_rpt_ownership.application_id = <application id>

    AND cb_report.cb_region_id = cb_region.cb_region_id

    AND cb_rpt_ownership.cust_id = <cust id>

    AND cb_region.cb_id = <cb id> (cb id values = 1, 2, 3)

  • Other that removing the * and include only the needed columns, there's not much advice I could give with the information provided.

    For further assistance, please read the following article: http://qa.sqlservercentral.com/articles/SQLServerCentral/66909/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Read the article Luis posted.

    Without DDL, sample data there's not much help we can provide. That said, here's a couple bits of advice.

    If you don't need that ORDER BY clause then get rid of it. If you don't need all the columns in your 1st SELECT statement you should specify only the ones you need and index accordingly.

    You could also update your WHERE clause in the first query to:

    WHERE credit_application.decision_status IN ('c','d').

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Thank you Guys

  • mufadalhaiderster (8/20/2015)


    It would be a good practice for some of you experts.

    Heh... actually, it would be good practice for YOU! 😉 Download Grant Frichey's free book on execution plans and learn something new. :w00t:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • The key to best performance is the best clustering index.

    For what's shown, the only potential change that's needed is for the credit_application table. Perhaps it should be clustered on decision_date, even if that date changes once (typically only once, very rarely 2+ times).

    Also, for date/datetime[2], it's best to write compares as >= and < rather than >= and <=.

    WHERE credit_application.decision_date >= 'yyyymmdd'

    AND credit_application.decision_date < 'yyyymm{dd+1}'

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • For too many reasons to explain here, SQL Server really, really likes NOT NULL, UNIQUE, ever increasing, narrow clustered indexes. A date time column can be very good for that except that it's not unique. To get around that and some other problems with using a date/time as the clustered index, use the date column as the first column of the clustered index and the PK column as the second column. Yes, that will make it a bit wider but your queries will eventually thank you for it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • If you haven't downloaded Grant's book on execution plans, I would do it. It's an excellent book that explains a lot of things and it would give you good practice. While I don't see anything glaring (except what Luis and Alan have already pointed out), I do see a potential "gotcha" in your future. With the first query, you have the date range:

    'mm/dd/yyyy 23:59:59.992'

    If the decision_date column is a datetime, then by all means, change that to 'mm/dd/yyyy 23:59:59.997'.

    The datetime data type has a resolution of 0.003 seconds, or 3 ms. By using 92, you're capturing .990 and nothing else. You'll miss 0.993 and 0.997. I know this is is splitting seconds, but I've been busted by the datetime resolution before. That's when I decided to really dig into the way SQL Server datetime data type and see how it works under the hood. Take my word for it, if your system is in production long enough, you'll eventually get hit by it.

  • Ed Wagner (8/21/2015)


    If you haven't downloaded Grant's book on execution plans, I would do it. It's an excellent book that explains a lot of things and it would give you good practice. While I don't see anything glaring (except what Luis and Alan have already pointed out), I do see a potential "gotcha" in your future. With the first query, you have the date range:

    'mm/dd/yyyy 23:59:59.992'

    If the decision_date column is a datetime, then by all means, change that to 'mm/dd/yyyy 23:59:59.997'.

    The datetime data type has a resolution of 0.003 seconds, or 3 ms. By using 92, you're capturing .990 and nothing else. You'll miss 0.993 and 0.997. I know this is is splitting seconds, but I've been busted by the datetime resolution before. That's when I decided to really dig into the way SQL Server datetime data type and see how it works under the hood. Take my word for it, if your system is in production long enough, you'll eventually get hit by it.

    While all that is true and I know that Ed knows this, I strongly recommend against using any time values other than "00:00:00.000" when trying to establish whole-date ranges. Using the WHERE clause from the query that started this post, I'd change this...

    WHERE credit_application.decision_date >= 'mm/dd/yyyy'

    AND credit_application.decision_date <= 'mm/dd/yyyy 23:59:59.992'

    ... to this...

    WHERE credit_application.decision_date >= @StartDate

    AND credit_application.decision_date < DATEADD(dd,1,@EndDate)

    Of course, both variables need to have midnight times to them or be of one of the whole-date datatypes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (8/21/2015)


    For too many reasons to explain here, SQL Server really, really likes NOT NULL, UNIQUE, ever increasing, narrow clustered indexes. A date time column can be very good for that except that it's not unique. To get around that and some other problems with using a date/time as the clustered index, use the date column as the first column of the clustered index and the PK column as the second column. Yes, that will make it a bit wider but your queries will eventually thank you for it.

    Uniqueness is mainly relevant only if you are coming in via a nonclustered index. If you do most of your access via the clustered index, as you would in this case, it doesn't particularly matter. It's certainly at most a secondary issue. The primary issue is to get the best clustered key on the table, then you can worry about uniqueness if it's a factor.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • ScottPletcher (8/22/2015)


    Jeff Moden (8/21/2015)


    For too many reasons to explain here, SQL Server really, really likes NOT NULL, UNIQUE, ever increasing, narrow clustered indexes. A date time column can be very good for that except that it's not unique. To get around that and some other problems with using a date/time as the clustered index, use the date column as the first column of the clustered index and the PK column as the second column. Yes, that will make it a bit wider but your queries will eventually thank you for it.

    Uniqueness is mainly relevant only if you are coming in via a nonclustered index. If you do most of your access via the clustered index, as you would in this case, it doesn't particularly matter. It's certainly at most a secondary issue. The primary issue is to get the best clustered key on the table, then you can worry about uniqueness if it's a factor.

    I do, absolutely agree that selecting the proper clustered index is first and foremost. However, since every nonclustered index contains the key columns of the Clustered Index and since it's a rare thing that someone would create only a Clustered Index on a table of any decent size, having a Unique Clustered Index is almost always relevant. Please see the following video as to why, especially starting at time stamp 13:35.

    http://download.microsoft.com/download/2/F/8/2F83DD0D-5764-4493-82BF-6AD252EC297B/HDI-ITPro-TechNet-mp4video-MCM_06_ClusteredIndexDebate.m4v

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (8/22/2015)


    Ed Wagner (8/21/2015)


    If you haven't downloaded Grant's book on execution plans, I would do it. It's an excellent book that explains a lot of things and it would give you good practice. While I don't see anything glaring (except what Luis and Alan have already pointed out), I do see a potential "gotcha" in your future. With the first query, you have the date range:

    'mm/dd/yyyy 23:59:59.992'

    If the decision_date column is a datetime, then by all means, change that to 'mm/dd/yyyy 23:59:59.997'.

    The datetime data type has a resolution of 0.003 seconds, or 3 ms. By using 92, you're capturing .990 and nothing else. You'll miss 0.993 and 0.997. I know this is is splitting seconds, but I've been busted by the datetime resolution before. That's when I decided to really dig into the way SQL Server datetime data type and see how it works under the hood. Take my word for it, if your system is in production long enough, you'll eventually get hit by it.

    While all that is true and I know that Ed knows this, I strongly recommend against using any time values other than "00:00:00.000" when trying to establish whole-date ranges. Using the WHERE clause from the query that started this post, I'd change this...

    WHERE credit_application.decision_date >= 'mm/dd/yyyy'

    AND credit_application.decision_date <= 'mm/dd/yyyy 23:59:59.992'

    ... to this...

    WHERE credit_application.decision_date >= @StartDate

    AND credit_application.decision_date < DATEADD(dd,1,@EndDate)

    Of course, both variables need to have midnight times to them or be of one of the whole-date datatypes.

    That absolutely works, of course. The key is to use the < instead of the <=. It is definitely cleaner.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply