Novice needing better performance from query

Question

Post reply

Novice needing better performance from query

TfromCart

SSC Rookie

Points: 32
More actions
December 28, 2005 at 1:17 pm

#166377

This seemingly simple query is running way too long. Any suggestions on how I can improve it?

select shipment.shipment Shipment, shipment.store_no DC,

shipped.container Container, max(carton.scandate) Last_ScanD,

count(distinct(carton_no)) Total_Cartons,

a.Unscanned_Cart

from

(select shipment, count(carton_no) Unscanned_Cart

from carton where

scandate = '12/30/1899'

group by shipment)

a, shipment, shipped, tacalendar, tamaster, carton

where shipment.shipment = tamaster.name

and tamaster.id_ta = tacalendar.id_ta

and shipment.shipment = carton.shipment

and shipment.shipment = shipped.shipment

and shipment.shipment = a.shipment

and tamaster.calendarmodule = 'SHIPMENT'

and tacalendar.name = 'RCV - Receive DC'

and shipment.completed = 0

and tacalendar.act_finish = '12/30/1899'

group by shipment.shipment, shipment.store_no, shipped.container, a.unscanned_cart

Any help would be appreciated!

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Charles Otten Default port Points: 1475 More actions · Answer 1

I would suggest you rewrite your SQL into ANSI accepted standards. Using commas instead of explicitly using JOIN and since you are using commas you don't use the ON clause. I realize the end result of data is identical but how SQL Server has to parse the T-SQL is not.

I would suggest adding the JOIN's and ON's and possibly a with(nolock) if appropriate. While it should help with performance it will definitely improve readability of code and it assists the server to know exactly how to use indexes that exist. Try this code (assuming I have the joins correct) and see if it helps. Note I did not put in the with(nolock) clause on the tables as I'm not sure if the use of the data allows for that:

select shipment.shipment Shipment,

shipment.store_no DC,

shipped.container Container,

max(carton.scandate) Last_ScanD,

count(distinct(carton_no)) Total_Cartons,

a.Unscanned_Cart

from (select shipment, count(carton_no) Unscanned_Cart

from carton where scandate = '12/30/1899' group by shipment) a

join shipment a.shipment = shipment.shipment

join shipped on shipment.shipment = shipped.shipment

join tamaster on shipment.shipment = tamaster.name

join tacalendar on tamaster.id_ta = tacalendar.id_ta

join carton on shipment.shipment = carton.shipment

where tamaster.calendarmodule = 'SHIPMENT'

and tacalendar.name = 'RCV - Receive DC'

and shipment.completed = 0

and tacalendar.act_finish = '12/30/1899'

group by shipment.shipment, shipment.store_no, shipped.container, a.unscanned_cart

If the phone doesn't ring...It's me.

TfromCart SSC Rookie Points: 32 More actions · Answer 2

Thanks for the suggestion Charles. I'm running your query now and it is at around 22 minutes and counting. The problem started when I added the count(distinct(carton_no), but I don't know any other method to get the results I need. There are thousands of shipped records for each shipment and without the distinct it was multipling the count results. Any other ideas?

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 3

Just a little pointer. When you write a count(distinct(whatever)), the distinct operation will be done by running another query. You can look at that query using the plan from query analyser. You can then adjust your indexes accordingly.

Also could you post some sample data along with the required results so that we have a chance to give an alternative solution?

Frank Kalis SSC Guru Points: 111183 More actions · Answer 4

I would suggest you rewrite your SQL into ANSI accepted standards. Using commas instead of explicitly using JOIN and since you are using commas you don't use the ON clause. I realize the end result of data is identical but how SQL Server has to parse the T-SQL is not.

Interesting thesis! JOINing in the WHERE clause is in case of an INNER JOIN still perfectly valid ANSI syntax.

When you compare both queries and their execution plans you'll notice that not only the resultset, but also both plans are identical. The query optimiser is smart enough to figure this out by himself.

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Bryant McClellan SSCarpal Tunnel Points: 4337 More actions · Answer 5

Having used both syntaxes I would agree without looking that the execution plan and query result would be the same. Given the amount of execution time I am guessing that we have not seen the final execution plan from the original query, but using Display Estimated Execution Plan in Query Analyzer will give some clues about index usage and what the real plan will be without having to wait on the query to run.

------------
Buy the ticket, take the ride. -- Hunter S. Thompson

TfromCart SSC Rookie Points: 32 More actions · Answer 6

I'm still struggling with this same issue, so I'm going try try and provide a bit more information if anyone can help me out. Below is my query minus the container data. It takes 5 minutes to run and returns a total of 597 records. A sample result set follows...

select shipment.store_no DC, shipment.shipment Shipment,

max(carton.scandate) Last_ScanD,

count (carton_no) Total_Cartons, a.Unscanned_Cart

from

(select shipment, count(carton_no) Unscanned_Cart

from carton where

scandate = '12/30/1899'

group by shipment)

a, shipment, tacalendar, tamaster, carton

where shipment.shipment = tamaster.name

and tamaster.id_ta = tacalendar.id_ta

and shipment.shipment = carton.shipment

and shipment.shipment = a.shipment

and tamaster.calendarmodule = 'SHIPMENT'

and tacalendar.name = 'RCV - Receive DC'

and shipment.completed = 0

and tacalendar.act_finish = '12/30/1899'

group by shipment.shipment, shipment.store_no, a.unscanned_cart

order by shipment.store_no, shipemnt.shipment

Results:

11 27482 1899-12-30 00:00:00.000 176 176

11 27488 2005-12-20 00:00:00.000 352 136

11 27489 2005-12-16 00:00:00.000 209 202

11 27490 2005-12-21 00:00:00.000 165 1

Following is the query with the container data added. It takes 7 minutes to run and returns 959 rows, but the problem is the Total Cartons are no longer correct...the count as been multipled by the number of times that shipment number appears in the shipped table. A result set follows...

select shipment.store_no DC, shipment.shipment Shipment,

shipped.container Container, max(carton.scandate) Last_ScanD,

count (carton_no) Total_Cartons, a.Unscanned_Cart

from

(select shipment, count(carton_no) Unscanned_Cart

from carton where

scandate = '12/30/1899'

group by shipment)

a, shipment, shipped, tacalendar, tamaster, carton

where shipment.shipment = tamaster.name

and tamaster.id_ta = tacalendar.id_ta

and shipment.shipment = carton.shipment

and shipment.shipment = shipped.shipment

and shipment.shipment = a.shipment

and tamaster.calendarmodule = 'SHIPMENT'

and tacalendar.name = 'RCV - Receive DC'

and shipment.completed = 0

and tacalendar.act_finish = '12/30/1899'

group by shipment.shipment, shipment.store_no, shipped.container, a.unscanned_cart

order by shipment.store_no, shipment.shipment

Results:

11 27482 GATU 0094143 1899-12-30 00:00:00.000 2816 176

11 27488 217 6564 0676 2005-12-20 00:00:00.000 3872 136

11 27489 HLXU 3058570 2005-12-16 00:00:00.000 1672 202

11 27490 HLXU 3111545 2005-12-21 00:00:00.000 495 1

My solution was to change the count (carton_no) Total_Cartons to a count (distinct(carton_no). The problem is now my run time is through the roof. In fact, I have yet to let if finish running. The last time I ran it I finally killed it after an hour. That is simply an unacceptable amount of time.

I hope this gives you a bit more info. and maybe a good solution for me...

Thanks again!

Steve Jones - SSC Editor SSC Guru Points: 728159 More actions · Answer 7

Are all the tables indexed on the columns being referenced in the join? For example, is there an index on shipment.shipment?

Have you looked at the query plan to see what is taking a long time? In QA, press CTRL-K and then run the query. You'll have a new tab in the results to examine the plan chosen and see which item takes the longest percentage wise. Look for straight arrows (scans) as well. These can be an indication of poor indexing.

Follow me on Twitter: http://www.twitter.com/way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com

TfromCart SSC Rookie Points: 32 More actions · Answer 8

TfromCart

SSC Rookie

Points: 32

December 29, 2005 at 12:49 pm

#612031

There is a clustered index on the cartons table which seems to be sucking up the most processing time, i.e., 33% each time.

Bryant McClellan SSCarpal Tunnel Points: 4337 More actions · Answer 9

Bryant McClellan

SSCarpal Tunnel

Points: 4337

December 29, 2005 at 12:52 pm

#612033

And is that showing a clustered index seek or a clustered index scan?

------------
Buy the ticket, take the ride. -- Hunter S. Thompson

TfromCart SSC Rookie Points: 32 More actions · Answer 10

TfromCart

SSC Rookie

Points: 32

December 29, 2005 at 12:54 pm

#612035

Scan

Bryant McClellan SSCarpal Tunnel Points: 4337 More actions · Answer 11

Likely that is part of the problem.

Is the clustered index on an identity field perhaps? If so, you could add in your where clause ...And carton.identity_field > 0. That will cause the optimizer the choose a clustered index seek...at least that is the theory here. Without seeing the tables and indexing scheme I am guessing a bit.

Beyond that, any other query that shows an Index Scan, Clustered Index Scan or Table Scan is a potential bottleneck and should be looked at.

------------
Buy the ticket, take the ride. -- Hunter S. Thompson

Charles Otten Default port Points: 1475 More actions · Answer 12

Have you thought of getting the cartons with an inline query instead? I've seen this improve my queries from time to time. Frank K might have some info on if this impacts other processes as he has much more experience than me.

For example:

select shipment.store_no DC, shipment.shipment Shipment,

shipped.container Container, max(carton.scandate) Last_ScanD,

count (carton_no) Total_Cartons,

(select count(carton_no) from carton a where

scandate = '12/30/1899' and a.shipment = shipment.shipment) Unscanned_Cart

from

--a,

shipment, shipped, tacalendar, tamaster, carton

where shipment.shipment = tamaster.name

and tamaster.id_ta = tacalendar.id_ta

and shipment.shipment = carton.shipment

and shipment.shipment = shipped.shipment

--and shipment.shipment = a.shipment

and tamaster.calendarmodule = 'SHIPMENT'

and tacalendar.name = 'RCV - Receive DC'

and shipment.completed = 0

and tacalendar.act_finish = '12/30/1899'

group by shipment.shipment, shipment.store_no, shipped.container, a.unscanned_cart

order by shipment.store_no, shipment.shipment

If the phone doesn't ring...It's me.

sklein SSC Enthusiast Points: 195 More actions · Answer 13

I would try the following

Instead of creating a dynamic table within the select statement, declare a variable table ...

declare @a TABLE ( shipment datatype, carton_no_count int, Unscanned_Cart int, PRIMARY KEY shipment&nbsp

This allows you to actually index the table and seperates the complexity for simplier debugging.

TfromCart SSC Rookie Points: 32 More actions · Answer 14

Thanks for the help all...I finally got a good result. Charles, your suggestion helped. I passed both count statements via the method you suggested and that allowed the query to run faster and give me the result I needed. The final query looks like this:

select shipment.store_no DC, shipment.shipment Shipment,

shipped.container Container,

max(carton.scandate) Last_ScanD,

(select count(carton_no) from carton a where

a.shipment = shipment.shipment) Total_Cartons,

(select count(carton_no) from carton b where

scandate = '12/30/1899' and b.shipment = shipment.shipment) Unscanned_Cart

from

shipment, shipped, tacalendar, tamaster, carton

where shipment.shipment = tamaster.name

and tamaster.id_ta = tacalendar.id_ta

and shipment.shipment = carton.shipment

and shipment.shipment = shipped.shipment

and tamaster.calendarmodule = 'SHIPMENT'

and tacalendar.name = 'RCV - Receive DC'

and shipment.completed = 0

and tacalendar.act_finish = '12/30/1899'

group by shipment.shipment, shipment.store_no, shipped.container

order by shipment.store_no, shipment.shipment

Thanks again!