Retrieving Bulk Data

Question

Post reply

Retrieving Bulk Data

martincross

SSC Eights!

Points: 833
More actions
July 19, 2004 at 8:56 am

#84976

Right I have a problem that may not have a solution. We have an archaic, legacy app that is retrieving every record from 1 SQL table, but is retrieving the data one record at a time.

So instead of running SELECT xxxx,xxxx,xxxx FROM GenericReport and working on the result set, it's pulling out one record at a time, i.e. SELECT * FROM GenericReport WHERE UID=x . This table is typically 54,000 records long.

There's not much I can do about the way it's working, but I would like to improve upon the 3 minutes it takes to fetch the data. The server itself is fairly beefy, 2Gigs Ram, RAID 5 SCSI, Dual Xeons.

What does appear to be happening though, is the contents of the table are being read in from the disk, rather than from memory. And this table gets accessed enough that ideally I need an IMDB version of it.

Any suggestions for improving performance for this scenario, bearing in mind there's nothing I can do about the app. that's accessing it in such an inefficient way.

Thanks.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Jim P. SSCrazy Eights Points: 8725 More actions · Answer 1

Jim P.

SSCrazy Eights

Points: 8725

July 19, 2004 at 9:47 am

#515090

Is the table indexed?

Is it possible to create a view on the table...(i.e. Create View GenericReport as SELECT * FROM GenericReportBase WHERE UID= %userlogin% ) at login?

----------------
Jim P.

A little bit of this and a little byte of that can cause bloatware.

martincross SSC Eights! Points: 833 More actions · Answer 2

martincross

SSC Eights!

Points: 833

July 19, 2004 at 10:23 am

#515095

UID is a Unique ID - which it's using to single out one record.

Jim P. SSCrazy Eights Points: 8725 More actions · Answer 3

Jim P.

SSCrazy Eights

Points: 8725

July 19, 2004 at 10:33 am

#515096

My misread - sorry .

But still, would it handle a view instead of the base table. There the view is an order by

Create View GenericReport

as Select UID, field1, field2...

From GenericReportTable

Order By UID

----------------
Jim P.

A little bit of this and a little byte of that can cause bloatware.

rockmoose SSCarpal Tunnel Points: 4571 More actions · Answer 4

rockmoose

SSCarpal Tunnel

Points: 4571

July 19, 2004 at 11:25 am

#515102

Hi,

First make sure that the UID column is indexed.

Also the bottleneck might not be Sql Server, but could be the network or the app.

For the IMDB option You can look at: DBCC PINTABLE

/rockmoose

You must unlearn what You have learnt

rockmoose SSCarpal Tunnel Points: 4571 More actions · Answer 5

rockmoose

SSCarpal Tunnel

Points: 4571

July 19, 2004 at 11:30 am

#515103

Jim,

Order By is not allowed in views the way you proposed.

To order the rows I would suggest Clustered Index on the table.

The view approach would definitely not be faster unless it is an Indexed View, available in the Enterprise Edition.

/rockmoose

You must unlearn what You have learnt

martincross SSC Eights! Points: 833 More actions · Answer 6

martincross

SSC Eights!

Points: 833

July 19, 2004 at 1:35 pm

#515115

Thanks for your help guys,

The UID field has a clustered index, and I've tried out the view idea, and it had no performance benefit.

I'm going to give the DBCC PINTABLE a go and will report back.

rockmoose SSCarpal Tunnel Points: 4571 More actions · Answer 7

rockmoose

SSCarpal Tunnel

Points: 4571

July 19, 2004 at 2:07 pm

#515118

sp_tableoption is an alternative to DBCC PINTABLE ( does the same thing )

use either one that suits the needs.

/rockmoose

You must unlearn what You have learnt

martincross SSC Eights! Points: 833 More actions · Answer 8

PINTABLE did little to improve the performance, even after several runs to ensure that the pages were residing in memory.

Which indicates to me, that SQL isn't the bottleneck here? That's assuming I actually have got it running in memory. SQL's enlarged and static memory footprint, and increased cache pages tend to suggest it is now stuck in memory.

Allen Cui-55137 SSC Guru Points: 51650 More actions · Answer 9

Qouted from BOL.

DBCC PINTABLE does not cause the table to be read into memory. As the pages from the table are read into the buffer cache by normal Transact-SQL statements, they are marked as pinned pages. SQL Server does not flush pinned pages when it needs space to read in a new page. SQL Server still logs updates to the page and, if necessary, writes the updated page back to disk. SQL Server does, however, keep a copy of the page available in the buffer cache until the table is unpinned with the DBCC UNPINTABLE statement.

DBCC PINTABLE is best used to keep small, frequently referenced tables in memory. The pages for the small table are read into memory one time, then all future references to their data do not require a disk read.

Caution Although DBCC PINTABLE can provide performance improvements, it must be used with care. If a large table is pinned, it can start using a large portion of the buffer cache and not leave enough cache to service the other tables in the system adequately. If a table larger than the buffer cache is pinned, it can fill the entire buffer cache. A member of the sysadmin fixed server role must shut down SQL Server, restart SQL Server, and then unpin the table. Pinning too many tables can cause the same problems as pinning a table larger than the buffer cache.

rockmoose SSCarpal Tunnel Points: 4571 More actions · Answer 10

rockmoose

SSCarpal Tunnel

Points: 4571

July 19, 2004 at 3:04 pm

#515124

MartinCross,

Write a loop in QA retriving all the 54k records one by one, and time it.

See how long SQL Server has to work in doing that, might help you determine if sql is the bottleneck.

Curious.

/rockmoose

You must unlearn what You have learnt

hoo-t SSChampion Points: 12713 More actions · Answer 11

hoo-t

SSChampion

Points: 12713

July 19, 2004 at 3:28 pm

#515126

You may want to take a look at the condition of your index as well, by running dbcc showcontig.

Steve

Julian Kuiters SSCertifiable Points: 6585 More actions · Answer 12

We have an archaic, legacy app that is retrieving every record from 1 SQL table, but is retrieving the data one record at a time. ... i.e. SELECT * FROM GenericReport WHERE UID=x .

Your client app is the problem here. SQL Server is going to read what it can from memory, but there may be other more frequently read/write pages in memory. SQL Server decides what is or is not in the buffer cache.

SQL Server also doesn't predict what you *might* do next and read ahead. If you did a SELECT * without the WHERE, SQL Server will read as much as it can from disk to buffer to odbc/output as quickly as possible. But if you ask for each row individually, one by one, then SQL Server treats each of those requests as individual requests, preparing the SQL, optimising, executing, and returning the results. Obvoiusly the overheads in preparing and executing the individual row selects is what is killing you here, and theres nothing really you can do about it.

I hate being stuck with a poorly written app!

Julian Kuiters
juliankuiters.id.au

martincross SSC Eights! Points: 833 More actions · Answer 13

OK, my TSQL isn't the best in the world - but this took 5 minutes instead of 3.

CREATE PROCEDURE dbo.VMR_BenchMark 
AS

DECLARE @iMaxUID Int
DECLARE @iMinUID Int

PRINT 'Start Time: ' + cast( getdate()  as nvarchar(20))

SELECT @iMaxUID = MAX(UID) FROM GenericReport
SELECT @iMinUID = MIN(UID) FROM GenericReport

WHILE @iMinUID <= @iMaxUID
BEGIN
 SELECT * FROM GenericReport WHERE UID = @iMinUID
 SET @iMinUID = @iMinUID + 1
END
PRINT 'End Time: ' + cast(getdate() as nvarchar(20) )
GO

I hate being stuck with crummy apps. too!

Ken Klaft-80806 Say Hey Kid Points: 693 More actions · Answer 14

First of all, my condolences to you. Getting stuck with a lame app is no picnic. I've been there.

This type of situation forces me to <ug. management buzzphrase> think outside the box. If your SQL Server is beefy and the table is tweaked to the best of your ability, look elsewhere. By that, I mean look at the server the app is running on and how it connects to your SQL Server. For the price of a few gigabit network cards and a slice of Cat5e, you might be able to squeeze a little better transfer time out of an app that screams "my developer doesn't know what he's doing."

Hopefully my suggestion (and my use of a lame catchphrase) doesn't get me flamed too much...