Finding the best way...

  • Everyone,

    Please guide me on which approach is better for big databases.

    Scenario:

    I have to compare two of my tables: one is archived and the other is production. I want to update my archive table based on the greater modified date. Let's say I have 10,000 records with a greater modified date, and all of them are in my #temp table.

    NOW I have two ways:

    - I can update those records from the temp table into my archive table

    and

    - I can delete those records from the archive table and INSERT them again from the temp table.

    The record count may also grow beyond 10,000.

    What should I do to keep performance in mind?

    Thanks,

    Noman

  • I would update the rows that exist and insert the ones that don't. Like this:

    UPDATE A
    SET COLUMN_list = T.COLUMN_list
    FROM Archive A
    JOIN #temp T ON A.primary_key = T.primary_key;

    INSERT INTO Archive (COLUMN_list)
    SELECT T.COLUMN_list
    FROM #temp T
    LEFT JOIN Archive A ON T.primary_key = A.primary_key
    WHERE A.primary_key IS NULL;
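    On SQL Server 2008 and later, the same upsert logic can also be written as a single MERGE statement; a sketch only, reusing the placeholder table and column names from the example above:

    ```sql
    -- Sketch: COLUMN_list and primary_key are placeholders, as in the example above.
    MERGE Archive AS A
    USING #temp AS T
        ON A.primary_key = T.primary_key
    WHEN MATCHED THEN
        UPDATE SET COLUMN_list = T.COLUMN_list
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (COLUMN_list) VALUES (T.COLUMN_list);
    ```

    One statement means one scan of the join instead of two, though MERGE has its own quirks, so test it against the two-statement version on your own data.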

    Jack Corbett
    Consultant - Straight Path Solutions
    Check out these links on how to get faster and more accurate answers:
    Forum Etiquette: How to post data/code on a forum to get the best help
    Need an Answer? Actually, No ... You Need a Question

  • I like Jack's solution, though I might update, delete the matches, insert the rest. Not sure which is better.
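    The delete-the-matches-then-insert-the-rest variant might be sketched like this, reusing Jack's placeholder names:

    ```sql
    -- Delete archive rows that have a newer copy waiting in #temp...
    DELETE A
    FROM Archive A
    JOIN #temp T ON A.primary_key = T.primary_key;

    -- ...then insert every row from #temp in a single pass.
    INSERT INTO Archive (COLUMN_list)
    SELECT T.COLUMN_list
    FROM #temp T;
    ```

    Note this rewrites matched rows entirely rather than updating them in place, which can mean more logging and index churn than Jack's update/insert pair.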

    Noman - I have run into something very similar in some of the processes we run. After some exhaustive searching, I found a combination of things that helped and that were recommended by a few experts: transactions and truncating. Both appear to speed up a process like yours. I created some sample data (9 columns by 10,000 rows) and then created an archive table that had half as many records. I used Jack's answer as the comparison, and below are the stats:

    Truncate and Transactions: 62ms average total execution time over three trials same data

    Update and then Insert: 168ms average total execution time over three trials same data

    Either way, you can't blink your eyes faster than this finishes with 10,000 records at 9 columns wide.

    BEGIN TRAN;

    -- Truncate inside the transaction so a failed insert can roll the truncate back.
    TRUNCATE TABLE Archive;

    INSERT INTO Archive (COLUMN_list)
    SELECT T.COLUMN_list
    FROM #temp T;

    COMMIT;
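    For what it's worth, averages like the ones above can be gathered with SQL Server's built-in timing output; a sketch of an assumed harness (not the actual test script from this post), using the same placeholder names:

    ```sql
    SET STATISTICS TIME ON;   -- elapsed and CPU times appear in the Messages output

    BEGIN TRAN;
    TRUNCATE TABLE Archive;
    INSERT INTO Archive (COLUMN_list)
    SELECT T.COLUMN_list
    FROM #temp T;
    COMMIT;

    SET STATISTICS TIME OFF;
    ```

    Run each approach several times and average the elapsed times, discarding the first run so compilation and caching don't skew the numbers.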

  • Nice one Wesley. Certainly truncate will be fast. Of course any indexes will affect the solutions as well.


  • Jack, how so? I am not familiar enough with indexes to understand your point. Will they have to be rebuilt after a truncate?

  • Indexes either speed up or slow down the Updates (how's that for being specific) and will slow down the insert, especially for a clustered index.

    For example, with your test data, if you add a clustered index on audit_id, the update/insert approach (which in this case does only an update) averaged around 140ms on my laptop, and the truncate/insert averaged around 100ms. I'd have to run a few more tests to really get a good feel, but that's the general idea.


  • So, if I understand correctly, a good index will likely speed up the update process, but have no bearing on the insert?

    Here's the way I understand it, and I could be wrong, so don't take this as the definitive answer. Check out Gail Shaw's and Kimberly Tripp's blogs; they have some excellent posts on indexes.

    If the columns in the index are being updated then the index will likely slow down the update because SQL Server has to update the index pages as well. In the case of an index on a primary key that will NOT be updated it should "find" the rows to update faster and thus the update will be faster.

    Inserts will probably be a bit slower because the index pages have to be updated as well, particularly if a row lands in the middle of a clustered index. This is why, especially in OLTP systems, it is recommended that your clustered index be on a monotonically increasing value like an identity column or a created date. Think about it like this: if I have a clustered index on last name and I insert 'Aaron, Henry', which belongs at the beginning of the index, SQL Server has to make room for it there (splitting pages if necessary), so that insert will be slower than if I have an identity column as the clustered index, where the new row simply goes at the end.
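    The contrast between the two clustered-index choices might look like this; hypothetical tables for illustration, not from the thread's test data:

    ```sql
    -- New rows always sort last on audit_id, so inserts append at the end
    -- of the clustered index: no mid-index page splits.
    CREATE TABLE Archive_ByIdentity
    (
        audit_id  INT IDENTITY(1,1) NOT NULL,
        last_name VARCHAR(100)      NOT NULL,
        CONSTRAINT PK_Archive_ByIdentity PRIMARY KEY CLUSTERED (audit_id)
    );

    -- New rows land wherever the name sorts: 'Aaron, Henry' goes near the
    -- front, causing page splits as pages fill and making inserts costlier.
    CREATE TABLE Archive_ByName
    (
        audit_id  INT               NOT NULL,
        last_name VARCHAR(100)      NOT NULL,
        CONSTRAINT PK_Archive_ByName PRIMARY KEY CLUSTERED (last_name, audit_id)
    );
    ```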


    Thanks, Jack. Your response, and the links, proved very informative.

  • I sense extreme danger... someone tell me why you'd truncate a perfectly good archive table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

    If you truncate the table, you will lose all the data, including anything that may not be in your current production table. Is that what you really want to do? Also, what if there is a record in your archive table that happens to have a modified date within the range you are pulling down from production, but the record is no longer in production? If you simply delete everything from your archive table within that date range, you are going to lose that record. Again, I am not sure that is what you want. I'd say update the matching records first, then insert the new ones. Just my take, but I am not sure what your exact business needs are.

  • I had thoughts about that too. The conclusion I came to was that it's not truly an archive table, as I know it. An archive table would be filled with data that was correct on a certain date. In the processes I have created recently there needed to be a way to check a table for changes, or multiple tables for changes across one person/place/thing. Triggers are not desired, so what I was left with was comparing an image (archive) of the table to its current state. So that's one reason. The other is what the original poster stated:

    NOW i have two ways:

    - i can update those records from temp table to my archive table

    and

    - i can delete those records from archived and INSERT again from temp to archive table.

    ...what should i do in order to keep performance issue in mind.??

    Jeff, I loved the "I sense extreme danger", almost had coffee in my sinuses.
