Two nearly identical selects, very different performance. Why?

  • Here is a fun one.  2 selects like so...

    select t1.f1, t1.f2, t1.f3  from t1

    inner join t2 on t1.f1 = t2.f2

    left outer join t3 on t1.f1 = t3.f2

    where t1.f2 = 'blah' and t3.f1 is null

    This query Explains almost ideally, 3 index seeks , bookmark lookup, 2 nested loops, filter, done.  Each tables contains ~20 million records.

    Now, replace the list of fields in the Select clause with * (splat) and it still Explains well (identically except for an extra bookmark lookup before the filter to get all the field data (*), as one would expect).  3 index seeks , bookmark lookup, 2 nested loops, extra bookmark lookup, filter, done.

    When we execute them, the * version works and returns the expected result in milliseconds , but the one with explicit fields completes in 5minutes. That is longer than 3 table scans!  I dont get it.  Shouldn't the one with the explicit field list run better?

    Note: indexes and stats are up to date and clean.

  • Have you dropped the execution plans before each run?

     

  • Not explicitly.  I tested them in separate sessions.  May I ask for syntax to do that? 

  • DBCC FREEPROCCACHE

     

  • You might also want to check this out : DBCC DROPCLEANBUFFERS

     

    Those are not to be used on prod environement unless you realllllllllllllly have to .

  • Okay. I did both options preceding each query, with the same basic results. The version with explicit fields took 10 min, while the other too seconds to 2 minutes.

    These queries, incidentally, both return zero results.  We are going to force data to get some rows back and see if that makes a diff.

  • Lol, those complaints will subside when everything gets put back into ram and recompiled. But that can take a while in a test environement .

     

    Let us know how it goes after you get data back.

     

    Also can you tell us how the poucentages differ from one execution plan to the other.... maybe the problem is not the extra step, but a step that takes 3-4 times as long as it should.

  • Even more !

    The entire execution time subtree cost for * is .0130; for the distinct it is 0.0123.  Yet the relative execution times are 360 seconds apart!

     

  • Do you have large text columns or blobs (not even sure how they work) that could explain this?

     

    Or large fragmentation (on the disk).

  • So... I am not crazy.  Okay, well, at least on this subject.

    Disk fragmentation is low, thanks to my buddies in Ops, and all the fields being retrieved are from Table1 of the query, and they are bigint, varchar(10), varchar(10), tinyint.

    But even fragmentation or blobs would not explain why * goes orders of magnitude faster than explicit fields.

    If we list even 1 field explicitly the query goes long.  If we list fields and put a * at the end of the list it screams.  Wacky!  Unfortunately this is a select feeding an INSERT so we have to use explicit fields.  Wow, we use this sort of technique all the time and it doesnt go this way.  Something about one of these tables has got to be bad.  I will post back to here... it is getting personal now. Must conquer MS SQL.

     

  • Lol, good luck.  In the mean time maybe someone else will have something else to propose.

  • Hmmm... Weird.

    Maybe you could look at/post the actual execution plans for both queries?

  • From the original thread :

    where t1.f2 = 'blah' and t3.f1 is null

    This query Explains almost ideally, 3 index seeks , bookmark lookup, 2 nested loops, filter, done.  Each tables contains ~20 million records.

    Now, replace the list of fields in the Select clause with * (splat) and it still Explains well (identically except for an extra bookmark lookup before the filter to get all the field data (*), as one would expect).  3 index seeks , bookmark lookup, 2 nested loops, extra bookmark lookup, filter, done.

  • 🙂 Hi. Back again.  The problem has gone away, but it doesnt make sense. 

    To remind you all, and me, the query was either "select [specific fields]" or "select *" and the execution plans for both were virtually identical (an extra book mark lookup as one would expect for the * version).  But the performance for the [specific fields] was close to 10 minutes, while the [*] version was nearly instantaneous. 

    Well, as a matter of database maintenance, the indexes were rebuilt, and suddenly the "select [specific fields]" version is working "correctly" ... meaning returning results at least as quickly as the [*] version.  The execution plans have not changed one iota.

    Okay, so the index was probably severely fragmented. I shoulda checked that. But,  I still dont understand why select [specific fields] and select * would perform sooooooo differently -- favoring the * -- they should have performed equally poorly.  Does anyone have any hints? 

    I will chock it up to "the great Microsoft whatever" unless someone can logically explain it away.

     

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply