Fine tuning Queries

Question

Post reply

Fine tuning Queries

SQLPain

Hall of Fame

Points: 3522
More actions
August 20, 2015 at 11:27 am

#319974

Can anyone look at the following queries and suggest a more efficient way of writing them. It would be a good practice for some of you experts.

1) Retrieve loan records, passing in Start and End Date values:

SELECT applicant.*, credit_application.* FROM credit_application

INNER JOIN applicant ON credit_application.application_id = applicant.application_id

WHERE credit_application.decision_date >= 'mm/dd/yyyy'

AND credit_application.decision_date <= 'mm/dd/yyyy 23:59:59.992'

AND credit_application.delete_app = 0

AND (credit_application.decision_status = 'C' OR credit_application.decision_status = 'D')

ORDER BY credit_application.decision_status

Save Application ID, Product ID, Applicant.cust_id

2)Retrieve source office and region based on application id:

SELECT credit_application.src_office_id, source_office.name, region.region_name From credit_application

INNER JOIN source_office On source_office.src_office_id = credit_application.src_office_id

INNER JOIN region ON source_office.region_id = region.region_id

WHERE credit_application.application_id = <application id>

3)Retrieve credit bureau name information:

SELECT cb_region.region_name FROM cb_region

INNER JOIN cb_report On cb_report.cb_id = cb_region.cb_id

INNER JOIN cb_rpt_ownership ON cb_rpt_ownership.cb_rpt_id = cb_report.cb_rpt_id

WHERE cb_rpt_ownership.application_id = <application id>

AND cb_report.cb_region_id = cb_region.cb_region_id

AND cb_rpt_ownership.cust_id = <cust id>

AND cb_region.cb_id = <cb id> (cb id values = 1, 2, 3)

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Luis Cazares SSC Guru Points: 183686 More actions · Answer 1

Other that removing the * and include only the needed columns, there's not much advice I could give with the information provided.

For further assistance, please read the following article: http://qa.sqlservercentral.com/articles/SQLServerCentral/66909/

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Alan Burstein SSC Guru Points: 61136 More actions · Answer 2

Read the article Luis posted.

Without DDL, sample data there's not much help we can provide. That said, here's a couple bits of advice.

If you don't need that ORDER BY clause then get rid of it. If you don't need all the columns in your 1st SELECT statement you should specify only the ones you need and index accordingly.

You could also update your WHERE clause in the first query to:

WHERE credit_application.decision_status IN ('c','d').

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

SQLPain Hall of Fame Points: 3522 More actions · Answer 3

SQLPain

Hall of Fame

Points: 3522

August 20, 2015 at 2:23 pm

#1821548

Thank you Guys

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 4

mufadalhaiderster (8/20/2015)

It would be a good practice for some of you experts.

Heh... actually, it would be good practice for YOU! 😉 Download Grant Frichey's free book on execution plans and learn something new. :w00t:

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

ScottPletcher SSC Guru Points: 100262 More actions · Answer 5

The key to best performance is the best clustering index.

For what's shown, the only potential change that's needed is for the credit_application table. Perhaps it should be clustered on decision_date, even if that date changes once (typically only once, very rarely 2+ times).

Also, for date/datetime[2], it's best to write compares as >= and < rather than >= and <=.

WHERE credit_application.decision_date >= 'yyyymmdd'

AND credit_application.decision_date < 'yyyymm{dd+1}'

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 6

For too many reasons to explain here, SQL Server really, really likes NOT NULL, UNIQUE, ever increasing, narrow clustered indexes. A date time column can be very good for that except that it's not unique. To get around that and some other problems with using a date/time as the clustered index, use the date column as the first column of the clustered index and the PK column as the second column. Yes, that will make it a bit wider but your queries will eventually thank you for it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Ed Wagner SSC Guru Points: 287015 More actions · Answer 7

If you haven't downloaded Grant's book on execution plans, I would do it. It's an excellent book that explains a lot of things and it would give you good practice. While I don't see anything glaring (except what Luis and Alan have already pointed out), I do see a potential "gotcha" in your future. With the first query, you have the date range:

'mm/dd/yyyy 23:59:59.992'

If the decision_date column is a datetime, then by all means, change that to 'mm/dd/yyyy 23:59:59.997'.

The datetime data type has a resolution of 0.003 seconds, or 3 ms. By using 92, you're capturing .990 and nothing else. You'll miss 0.993 and 0.997. I know this is is splitting seconds, but I've been busted by the datetime resolution before. That's when I decided to really dig into the way SQL Server datetime data type and see how it works under the hood. Take my word for it, if your system is in production long enough, you'll eventually get hit by it.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 8

Ed Wagner (8/21/2015)

If you haven't downloaded Grant's book on execution plans, I would do it. It's an excellent book that explains a lot of things and it would give you good practice. While I don't see anything glaring (except what Luis and Alan have already pointed out), I do see a potential "gotcha" in your future. With the first query, you have the date range:

'mm/dd/yyyy 23:59:59.992'

If the decision_date column is a datetime, then by all means, change that to 'mm/dd/yyyy 23:59:59.997'.

The datetime data type has a resolution of 0.003 seconds, or 3 ms. By using 92, you're capturing .990 and nothing else. You'll miss 0.993 and 0.997. I know this is is splitting seconds, but I've been busted by the datetime resolution before. That's when I decided to really dig into the way SQL Server datetime data type and see how it works under the hood. Take my word for it, if your system is in production long enough, you'll eventually get hit by it.

While all that is true and I know that Ed knows this, I strongly recommend against using any time values other than "00:00:00.000" when trying to establish whole-date ranges. Using the WHERE clause from the query that started this post, I'd change this...



WHERE credit_application.decision_date >= 'mm/dd/yyyy'

 AND credit_application.decision_date <= 'mm/dd/yyyy 23:59:59.992'

... to this...

WHERE credit_application.decision_date >= @StartDate



 AND credit_application.decision_date  < DATEADD(dd,1,@EndDate)

Of course, both variables need to have midnight times to them or be of one of the whole-date datatypes.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

ScottPletcher SSC Guru Points: 100262 More actions · Answer 9

Jeff Moden (8/21/2015)

For too many reasons to explain here, SQL Server really, really likes NOT NULL, UNIQUE, ever increasing, narrow clustered indexes. A date time column can be very good for that except that it's not unique. To get around that and some other problems with using a date/time as the clustered index, use the date column as the first column of the clustered index and the PK column as the second column. Yes, that will make it a bit wider but your queries will eventually thank you for it.

Uniqueness is mainly relevant only if you are coming in via a nonclustered index. If you do most of your access via the clustered index, as you would in this case, it doesn't particularly matter. It's certainly at most a secondary issue. The primary issue is to get the best clustered key on the table, then you can worry about uniqueness if it's a factor.

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 10

ScottPletcher (8/22/2015)

Jeff Moden (8/21/2015)

For too many reasons to explain here, SQL Server really, really likes NOT NULL, UNIQUE, ever increasing, narrow clustered indexes. A date time column can be very good for that except that it's not unique. To get around that and some other problems with using a date/time as the clustered index, use the date column as the first column of the clustered index and the PK column as the second column. Yes, that will make it a bit wider but your queries will eventually thank you for it.

Uniqueness is mainly relevant only if you are coming in via a nonclustered index. If you do most of your access via the clustered index, as you would in this case, it doesn't particularly matter. It's certainly at most a secondary issue. The primary issue is to get the best clustered key on the table, then you can worry about uniqueness if it's a factor.

I do, absolutely agree that selecting the proper clustered index is first and foremost. However, since every nonclustered index contains the key columns of the Clustered Index and since it's a rare thing that someone would create only a Clustered Index on a table of any decent size, having a Unique Clustered Index is almost always relevant. Please see the following video as to why, especially starting at time stamp 13:35.

http://download.microsoft.com/download/2/F/8/2F83DD0D-5764-4493-82BF-6AD252EC297B/HDI-ITPro-TechNet-mp4video-MCM_06_ClusteredIndexDebate.m4v

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Ed Wagner SSC Guru Points: 287015 More actions · Answer 11

Jeff Moden (8/22/2015)

Ed Wagner (8/21/2015)

If you haven't downloaded Grant's book on execution plans, I would do it. It's an excellent book that explains a lot of things and it would give you good practice. While I don't see anything glaring (except what Luis and Alan have already pointed out), I do see a potential "gotcha" in your future. With the first query, you have the date range:

'mm/dd/yyyy 23:59:59.992'

If the decision_date column is a datetime, then by all means, change that to 'mm/dd/yyyy 23:59:59.997'.

The datetime data type has a resolution of 0.003 seconds, or 3 ms. By using 92, you're capturing .990 and nothing else. You'll miss 0.993 and 0.997. I know this is is splitting seconds, but I've been busted by the datetime resolution before. That's when I decided to really dig into the way SQL Server datetime data type and see how it works under the hood. Take my word for it, if your system is in production long enough, you'll eventually get hit by it.

While all that is true and I know that Ed knows this, I strongly recommend against using any time values other than "00:00:00.000" when trying to establish whole-date ranges. Using the WHERE clause from the query that started this post, I'd change this...

WHERE credit_application.decision_date >= 'mm/dd/yyyy' AND credit_application.decision_date <= 'mm/dd/yyyy 23:59:59.992'

... to this...

WHERE credit_application.decision_date >= @StartDate
AND credit_application.decision_date < DATEADD(dd,1,@EndDate)

Of course, both variables need to have midnight times to them or be of one of the whole-date datatypes.

That absolutely works, of course. The key is to use the < instead of the <=. It is definitely cleaner.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions