Retrieving Bulk Data

  • Right I have a problem that may not have a solution.  We have an archaic, legacy app that is retrieving every record from 1 SQL table, but is retrieving the data one record at a time. 

    So instead of running SELECT xxxx,xxxx,xxxx FROM GenericReport and working on the result set, it's pulling out one record at a time, i.e. SELECT * FROM GenericReport WHERE UID=x .  This table is typically 54,000 records long.

    There's not much I can do about the way it's working, but I would like to improve upon the 3 minutes it takes to fetch the data.  The server itself is fairly beefy, 2Gigs Ram, RAID 5 SCSI, Dual Xeons.

    What does appear to be happening though, is the contents of the table are being read in from the disk, rather than from memory.  And this table gets accessed enough that ideally I need an IMDB version of it.

    Any suggestions for improving performance for this scenario, bearing in mind there's nothing I can do about the app. that's accessing it in such an inefficient way.

     

    Thanks.

     

  • Is the table indexed?

    Is it possible to create a view on the table...(i.e. Create View GenericReport as SELECT * FROM GenericReportBase WHERE UID= %userlogin% ) at login?



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • UID is a Unique ID - which it's using to single out one record.

  • My misread - sorry .

    But still, would it handle a view instead of the base table. There the view is an order by

    Create View GenericReport

    as Select UID, field1, field2...

    From GenericReportTable

    Order By UID



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • Hi,

    First make sure that the UID column is indexed.

    Also the bottleneck might not be Sql Server, but could be the network or the app.

    For the IMDB option You can look at: DBCC PINTABLE

    /rockmoose


    You must unlearn what You have learnt

  • Jim,

    Order By is not allowed in views the way you proposed.

    To order the rows I would suggest Clustered Index on the table.

    The view approach would definitely not be faster unless it is an Indexed View, available in the Enterprise Edition.

    /rockmoose


    You must unlearn what You have learnt

  • Thanks for your help guys,

    The UID field has a clustered index, and I've tried out the view idea, and it had no performance benefit.

    I'm going to give the DBCC PINTABLE a go and will report back.

  • sp_tableoption is an alternative to DBCC PINTABLE ( does the same thing )

    use either one that suits the needs.

    /rockmoose


    You must unlearn what You have learnt

  • PINTABLE did little to improve the performance, even after several runs to ensure that the pages were residing in memory.

    Which indicates to me, that SQL isn't the bottleneck here?  That's assuming I actually have got it running in memory.  SQL's enlarged and static memory footprint, and increased cache pages tend to suggest it is now stuck in memory.

  • Qouted from BOL.

    DBCC PINTABLE does not cause the table to be read into memory. As the pages from the table are read into the buffer cache by normal Transact-SQL statements, they are marked as pinned pages. SQL Server does not flush pinned pages when it needs space to read in a new page. SQL Server still logs updates to the page and, if necessary, writes the updated page back to disk. SQL Server does, however, keep a copy of the page available in the buffer cache until the table is unpinned with the DBCC UNPINTABLE statement.

    DBCC PINTABLE is best used to keep small, frequently referenced tables in memory. The pages for the small table are read into memory one time, then all future references to their data do not require a disk read.

    Caution  Although DBCC PINTABLE can provide performance improvements, it must be used with care. If a large table is pinned, it can start using a large portion of the buffer cache and not leave enough cache to service the other tables in the system adequately. If a table larger than the buffer cache is pinned, it can fill the entire buffer cache. A member of the sysadmin fixed server role must shut down SQL Server, restart SQL Server, and then unpin the table. Pinning too many tables can cause the same problems as pinning a table larger than the buffer cache.

  • MartinCross,

    Write a loop in QA retriving all the 54k records one by one, and time it.

    See how long SQL Server has to work in doing that, might help you determine if sql is the bottleneck.

    Curious.

    /rockmoose


    You must unlearn what You have learnt

  • You may want to take a look at the condition of your index as well, by running dbcc showcontig.

    Steve

  • We have an archaic, legacy app that is retrieving every record from 1 SQL table, but is retrieving the data one record at a time.  ... i.e. SELECT * FROM GenericReport WHERE UID=x . 

    Your client app is the problem here. SQL Server is going to read what it can from memory, but there may be other more frequently read/write pages in memory. SQL Server decides what is or is not in the buffer cache.

    SQL Server also doesn't predict what you *might* do next and read ahead. If you did a SELECT * without the WHERE, SQL Server will read as much as it can from disk to buffer to odbc/output as quickly as possible. But if you ask for each row individually, one by one, then SQL Server treats each of those requests as individual requests, preparing the SQL, optimising, executing, and returning the results. Obvoiusly the overheads in preparing and executing the individual row selects is what is killing you here, and theres nothing really you can do about it.

    I hate being stuck with a poorly written app!

     


    Julian Kuiters
    juliankuiters.id.au

  • OK, my TSQL isn't the best in the world - but this took 5 minutes instead of 3.

    CREATE PROCEDURE dbo.VMR_BenchMark 
    AS
    DECLARE @iMaxUID Int
    DECLARE @iMinUID Int
    PRINT 'Start Time: ' + cast( getdate()  as nvarchar(20))
    SELECT @iMaxUID = MAX(UID) FROM GenericReport
    SELECT @iMinUID = MIN(UID) FROM GenericReport
    WHILE @iMinUID <= @iMaxUID
    BEGIN
     SELECT * FROM GenericReport WHERE UID = @iMinUID
     SET @iMinUID = @iMinUID + 1
    END
    PRINT 'End Time: ' + cast(getdate() as nvarchar(20) )
    GO

    I hate being stuck with crummy apps. too!

  • First of all, my condolences to you.  Getting stuck with a lame app is no picnic.  I've been there. 

    This type of situation forces me to <ug.  management buzzphrase> think outside the box.  If your SQL Server is beefy and the table is tweaked to the best of your ability, look elsewhere.  By that, I mean look at the server the app is running on and how it connects to your SQL Server.  For the price of a few gigabit network cards and a slice of Cat5e, you might be able to squeeze a little better transfer time out of an app that screams "my developer doesn't know what he's doing."

    Hopefully my suggestion (and my use of a lame catchphrase) doesn't get me flamed too much...

     

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply