Strange performance boost by query rewrite

Question

Post reply

Strange performance boost by query rewrite

otykier

SSC Veteran

Points: 247
More actions
April 24, 2011 at 2:23 pm

#235216

Hi folks.

Being a complete newbie in sql query optimization, I'd like some help in understanding the following. Imagine the following scenario:

-- Table of non-overlapping integer intervals. Some 200.000 rows in here. -- Primary Key (clustered index) on ivFrom-column. CREATE TABLE Intervals ( ivFrom int NOT NULL, ivTo int NOT NULL, ivCode char(4) NOT NULL ) -- Table of data where the "dValue" field may or may not lie in an interval defined in above table. Some 2.000.000 rows here. -- Primary Key (clustered index) on dKey-column. -- Non-clustered index on dValue-column. CREATE TABLE Data ( dKey int NOT NULL dValue int NOT NULL )

I am able to perform the following query, to count the number of entries in the data-table, where the dValue field lies in a valid interval, within fractions of a second:

SELECT COUNT(*) FROM Data INNER JOIN Intervals ON dValue BETWEEN ivFrom AND ivTo

However, when I try to query the invalid entries in the data-table, the query performs horribly (I terminated execution after it had run for more than 10 minutes, without having returned a single row):

-- Query A SELECT Data.* FROM Data LEFT JOIN Intervals ON dValue BETWEEN ivFrom AND ivTo WHERE Intervals.ivFrom IS NULL

Here's the funny thing. Now rewrite the above query to this, which should return the exact same result:

-- Query B SELECT b.* FROM Data a INNER JOIN Intervals ON a.dValue BETWEEN ivFrom AND ivTo RIGHT JOIN Data b ON a.dKey = b.dKey WHERE a.dKey IS NULL

...and now I get the entire result set in a few seconds. I really don't understand how adding more complexity to the query (joining a 3rd table), improves the performance so drastically. What am I missing here - why does Query A perform so badly?

Thanks in advance!

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 1

What do the execution plans look like?

Is it a consistent difference? Run both 5 times, discard the first, average the remaining durations.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

otykier SSC Veteran Points: 247 More actions · Answer 2

otykier

SSC Veteran

Points: 247

April 25, 2011 at 12:37 am

#1315316

The differences are consistent. Query A never completes within the 10 minutes my patience lasts, while Query B always executes in about 3 seconds.

Estimated execution plan for Query A:

Execution plan for Query B:

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 3

Can you post the plans please, not pictures of them. Right-click, save as .sqlplan, zip, attach to your post.

While QueryA is running, can you check what it's current wait type is?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

otykier SSC Veteran Points: 247 More actions · Answer 4

Here you go. Thanks a lot for looking into this 🙂

SELECT * FROM master.dbo.sysprocesses WHERE spid = 54



spid   kpid   blocked waittype waittime             lastwaittype                     ecid   status

------ ------ ------- -------- -------------------- -------------------------------- ------ ------------------------------

54     5472   0       0x00BB   652608               CXPACKET                         0      suspended

54     2160   0       0x0000   0                    SOS_SCHEDULER_YIELD              1      runnable

54     6000   0       0x0000   0                    SOS_SCHEDULER_YIELD              2      runnable

Syed Jahanzaib Bin hassan SSChampion Points: 11410 More actions · Answer 5

There are several indexes are showing Scanning it must be index seek

Index Seek is better than Index scan,work on it

Regards,

Syed Jahanzaib Bin Hassan

MCTS | MCITP | OCA | OCP | OCE | SCJP | IBMCDBA

My Blog

http://www.aureus-salah.com

My Blog
www.aureus-salah.com

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 6

Syed Jahanzaib Bin hassan (4/25/2011)

There are several indexes are showing Scanning it must be index seek

Index Seek is better than Index scan,work on it

There's no way to get index seeks with the query he has. There's no SARGable predicate, hence one table at least must be scanned (the entire table is needed).

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 7

Ok, the CXPacket is a parallel skew wait. It's just the outer thread waiting for the two parallel branches to finish. Interesting that it's a scheduler yield that the other two have, that's usually indicative of high CPU usage.

Can you try another query form for me, see how it performs...

SELECT *



FROM Data

WHERE NOT EXISTS (SELECT 1 FROM Intervals WHERE dValue BETWEEN ivFrom AND ivTo)

Exists usually performs better than joins, because SQL doesn't have to do the full join (as it does with the left join) it just has to check for matching rows.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Grant Fritchey SSC Guru Points: 397721 More actions · Answer 8

On a side note, I'd suggest taking your cost threshold for parallelism to a much higher number. It looks like it's at the default of 5. I usually bump mine up to 35 or so on OLTP systems.

----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

otykier SSC Veteran Points: 247 More actions · Answer 9

Thanks a lot for the responses so far, everyone.

Using EXISTS does unfortunately not improve the performance. The execution time still exceeds my 10 minute patience limit, and the wait type and estimated execution plans are exactly the same as with Query A, above.

Also, changing the cost threshold for parallelism to a higher number does not seem to make a difference in this case - but I must admit I have no clue what the option means, anyway.

David Burrows SSC Guru Points: 65111 More actions · Answer 10

David Burrows

SSC Guru

Points: 65111

April 25, 2011 at 5:30 am

#1315375

Grant Fritchey (4/25/2011)

... I usually bump mine up to 35 or so on OLTP systems.

Is that a general rule of thumb?

Are there any consequences to that setting?

Far away is close at hand in the images of elsewhere.
Anon.

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 11

I think it has to do with the position of the nested loop join in the plan, and the smaller number of rows in the outer table for the second query. Only a nested loop can handle a join without any equality, hence it's the join used, but it's terribly inefficient on larger row counts.

On the second query, the number of rows in the outer table, and hence the number of executions on the inner table, is smaller, hence faster.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Grant Fritchey SSC Guru Points: 397721 More actions · Answer 12

David Burrows (4/25/2011)

Grant Fritchey (4/25/2011)

... I usually bump mine up to 35 or so on OLTP systems.

Is that a general rule of thumb?

Are there any consequences to that setting?

Not really. Since it's dependent on the cost estimates of execution plans, it's sort of a guess anyway. I've set it as low as 25 and as high as 50 on different servers. It really depends on the queries being run there and whether or not parallelism hurts them or helps them.

----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 13

I'd try adding IvTo to that clustered index on IvFrom. Guessing wildly, both versions of the query would take about the same time, and rather less than you're currently getting.

Left join to a range without an index on a range bound is asking for trouble 😀

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

otykier SSC Veteran Points: 247 More actions · Answer 14

Hi Chris,

Thanks for your suggestion, but unfortunately, adding ivTo to the clustered index, does not alter the execution plan or the performance at all. I also tried creating a separate non-clustered index on ivTo, but that didn't help either.

This is a real brain teaser. There must be some techniques for optimizing range queries in general? Anyone got experience in that area?