Without dynamic SQL, Top X Query

  • I have a stored procedure, currently written using dynamic SQL, that selects the TOP X rows, where X depends on user input.

    I would like to rewrite the SP without the dynamic SQL.

    Replacing the X with an input parameter does not work.

    I have tried SETting ROWCOUNT to X.  This works, but is MUCH slower.

    Any hints would be appreciated.

    Joe

  • I'm not sure if a cursor would run any faster than the SET ROWCOUNT, but it is another option. Create a temp table to hold the results (@Table, not #Table to avoid logging), and FETCH results until end of cursor or until X results obtained.

    Same performance disclaimer as above especially if actual rows might be much greater than X: Create a temp table with an identity column and just select all results into it. Then select * from the temp table where identity column <= X. If you know X will never be greater than Y, but that there may be a lot more than Y results available you can use a TOP Y in the SELECT INTO when populating the temp table to improve performance a bit. If variable based tables (@Table) don't support Identity columns (I don't know without looking) you can either use a #Temp table or a second pass with an UPDATE statement to apply sequential numbering. Since that gets ugly (needs a subselect on itself) I'd just go with the #Temp table.

    I can't seem to come up with an alternative to Dynamic SQL or ROWCOUNT that doesn't involve a temp table at all.

  • Aaron,

    I'll try the cursor, but I'm guessing you're right about the speed.

    Definitely can't select all of the results.  MANY orders of magnitude more than is likely to be requested.  I WILL try the idea of selecting some number much greater than what the user is likely to choose, like 200, and process that using the temp table/identity.

    This may turn out to be not worth it.  I just don't like dynamic SQL.

     

  • Without knowing why the ROWCOUNT apparently performs as poorly as it apparently does, you might try a TOP 200 in conjuction with the ROWCOUNT. Just a stab in the dark.

    Also, the reason might simply be because you aren't running in Dynamic SQL. If the Dynamic SQL version was prepared using constants in the where clause instead of variables then statistics could cause a different access path to be chosen (since Dynamic SQL reassesses access paths every time). To confirm this you could run it both ways and get an execution plan for each and see what differs. If the plans are the same then there is apparently some sort of overhead with ROWCOUNT limiting (I've never used it yet and don't know) a result set. If the plans differ then the performance difference is likely due to the constants embedded into the dynamic SQL which means that my other two suggestions will perform as poorly as the ROWCOUNT method. In that case you would need to look into restructuring the SQL, changing indexes, or possibly providing index hints (non-ideal).

    Good luck.

  • If the ROWCOUNT xxx is outside the dynamic sql...

    Put it inside the dynamic Sql.

    You would gain performance by avoiding Dynamic Sql if possible.

    /rockmoose


    You must unlearn what You have learnt

  • Yeah, the whole point of the question was how to do it without the dynamic SQL.

    I converted it to non-dynamic, selecting everything, expecting ROWCOUNT to limit the result.  It worked but was very slow.

  • Ok, No Idea.

    Don't see why SET ROWCOUNT would make it much slower compared to TOP.

    You could try to hint:

    OPTION( FAST 200 ) in the select , see if it makes any difference.

    /rockmoose


    You must unlearn what You have learnt

  • Worth a try.  Thanks.

  • Another approach which will ONLY be practical if the maximum value of [X] is small and your query is relatively short: 

    Use a case statement with MAX(X) or MAX(X + 1) discrete cases (assuming you want the "TOP 0" case in there also).  

    It would be a long proc, but should be quick to implement with cut & paste, and should run fast enough... <grin>

    - john

     

     

  • John,

    Good thought.  Impractical in this case, though.  I currently have NO idea what the users are going to select, so I can't code myself into that kind of corner (already done it often enough in this project!).

    I'll keep the technique in mind, though.  Thanks.

  • I've run into exactly this problem.  Setting rowcount performed to slowly....look at the estimated execution plan for a hint as to why this is (BOL: "The setting of SET ROWCOUNT is set at execute or run time and not at parse time."):

    use pubs

    set rowcount 1

    select * from authors

    set rowcount 0

    select * from authors

     

    So, to get around this I created a view that encompassed my query.  Then, using dynamic SQL, I call the view as "Select top X * from View".  This means the joins, etc in the Query are "compiled", while only the calling of the view is dynamic and doesn't re-use the execution plan. 

    Hope this helps.

    Signature is NULL

  • Now THAT'S nice. 

    There are other variables besides the TOP X, such as start/end date/time, organizational units, etc.  The underlying table has about 7 million rows, and is updated daily, from about 25 sources, at irregular intervals.

    Does the view still sound like a viable option?  I have little experience with them.

    Thanks.

  • Sure...a view works well for these things.  It's a good way to encapsulate some complicated join syntax; with the added benefit of "compiling" the execution plan.

    Now, the where clause CAN easily change the execution plan.  With 7 million records every column in your where clause should be indexed; that will help more than anything.  Still, this certainly doesn't make the view less viable.

    cl

    Signature is NULL

  • That's good news, Calvin.

    I'll try it in the next build of the product.

  • Hi,

    I am doing it like this:

    select top 10 <columns list> from <table> where <condition> and <unique key> not in ( select top x <unique key> from <table> where <condition> )

    When x is 0 you will get top 10 rows

    When x is 10 you will get next top 10 rows

    and so on.......

    Note: the <condition> in both querys must be same.

     

     

    Regards,

     

     

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply