Truncate Table vs. Drop and Create

  • Hi all. I'm creating a simple SSIS package to pull some data from a view on the source server and write it to a table on the destination server. We want to completely replace the data in the destination table every time the package runs. So my question is: should I drop & recreate the destination table or just truncate it?

    I found this topic when I searched, but it seems to be fairly specific to the OP's situation, and the interesting part boils down to this:

    It's better to truncate and have the index already defined when you insert data.

    It's cleaner to drop and recreate the tables and indexes before you bcp in data but only marginally so.

    Which doesn't really help me decide. Why is it "better" to truncate when it's "cleaner" to drop and recreate? What I'm really looking for is thoughts on best practices, or some set of general criteria that I can apply to help me decide in any given situation.

    The specifics of this situation:

    - Source server is SQL 2000; the source object is a view that joins several tables and returns about 8000 rows (that number will grow, but slowly and at a predictable rate).

    - Destination server is SQL 2005; the destination table was created by the initial run of the data transfer wizard, so there are no indexes or other constraints to speak of (at least, not yet - I was kind of asked to help after the fact on this one).

    - The package will live and run on the destination server.

    Thanks!

  • These are the steps I use for this type of process:

    1) Disable all nonclustered (NC) indexes on the destination table

    a) ALTER INDEX IndexName ON table DISABLE;

    2) Truncate the destination table

    a) Minimally logged operation - we don't need to recover this data if we are reloading it from the source anyway

    3) Extract and load the data into the table

    4) Rebuild all indexes

    a) ALTER INDEX ALL ON table REBUILD;

    The extract/load process is generally much faster if there are no indexes on the table. Using the above process, the only index in place during the load is the clustered index, and all of the indexes are rebuilt after the table has been loaded, so the load process is fairly quick. It can be even quicker if the load is performed in the same order as the clustered index.

    You can't disable the clustered index, though - if you do, the table is no longer accessible until that index is rebuilt.
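
    Putting those steps together, a minimal T-SQL sketch of the pattern might look like this (the table and index names are placeholders for illustration, not anything from an actual system):

    -- 1) Disable each nonclustered index on the destination table
    ALTER INDEX IX_DestTable_OrderDate ON dbo.DestTable DISABLE;

    -- 2) Clear out the old data; TRUNCATE is minimally logged
    TRUNCATE TABLE dbo.DestTable;

    -- 3) Extract and load here (SSIS data flow, bcp, or INSERT ... SELECT)

    -- 4) Rebuild all indexes, which also re-enables the disabled NC indexes
    ALTER INDEX ALL ON dbo.DestTable REBUILD;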

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • That's pretty much the same thing I do. Disable indexes, truncate, load, enable indexes. It's very efficient.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Jeffrey & GSquared, thanks for the procedures and advice. Is it fair to say that the efficiency is primarily due to disabling the indexes, and that it wouldn't make much difference if I drop/create the destination table rather than truncate it? To be clear, truncating the table seems like the natural choice, but I'm trying to understand why or when you might want to drop and recreate instead. I'm looking at various tutorials and the drop/create method seems fairly common - do you think that's just because it's easier to do with the wizard? i.e. it's an easier example/demo but not necessarily how you'd do it in production?

    Also, Jeffrey, when you say

    ...the only index I have is the clustered index and all indexes will be rebuilt after the table has been loaded, so the load process is fairly quick. It can be even quicker if the load is performed in the same order as the clustered index.

    does that mean I should sort the source result set? e.g. the destination table has a clustered index on OrderID, so I should sort the source records on OrderID before they're loaded?

  • If you drop and recreate the destination table every time, then you also have to recreate the indexes and re-apply all security to the table. Then, what happens when someone else creates a new index that your process does not know about?

    Truncating the table keeps all indexes and permissions in place and takes only a few seconds to clear the data out; the truncate is also minimally logged.

    does that mean I should sort the source result set? e.g. the destination table has a clustered index on OrderID, so I should sort the source records on OrderID before they're loaded?

    Sorting the result set prior to loading the data can help reduce both the load time and the clustered index rebuild time. I do not bother with this step myself, since the process I use is well within the acceptable performance requirements.

    It's really a step that you can take to improve performance if needed - and so far, I have not needed it.
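
    For illustration (the view and column names here are hypothetical, not anything from this thread, apart from the OrderID key mentioned above), a sorted extract query might look like:

    -- Hypothetical source query: sort the extract to match the
    -- destination's clustered index key (OrderID, per the example above)
    SELECT OrderID, CustomerID, OrderDate
    FROM dbo.vOrderExtract
    ORDER BY OrderID;

    If you're loading through an OLE DB destination with fast load, I believe you can also add an ORDER hint to the FastLoadOptions so the bulk insert knows the incoming rows already match the clustered key.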

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams (4/22/2009)

    If you drop and recreate the destination table every time, then you also have to recreate the indexes and re-apply all security to the table. Then, what happens when someone else creates a new index that your process does not know about?

    Cool, this makes a lot of sense and actually seems a bit obvious in hindsight. :blush: But that's the great thing about this site - I get to capitalize on the experience of others! Thanks again for your help.

    --MIJ

  • Yes - that is the idea. Glad I could help out.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • The specifics of this situation:

    - Source server is SQL 2000; the source object is a view that joins several tables and returns about 8000 rows (that number will grow, but slowly and at a predictable rate).

    - Destination server is SQL 2005; the destination table was created by the initial run of the data transfer wizard, so there are no indexes or other constraints to speak of (at least, not yet - I was kind of asked to help after the fact on this one).

    - The package will live and run on the destination server.

    I'll share my experience on this one too, although I'll touch more on the actual extract part.

    I see that you're using a view on the SQL2K box to get at the source data. I used to use a similar method when I was loading history into my company's new data warehouse box. The source is on 2K and the new DWH is running 2K5.

    What happened was that as the size of the data set returned from the source grew, the extract took longer and longer, until the SSIS package started timing out because the source was not responding in time. The solution was to extract the individual tables unjoined - flat, in effect - to a staging area, then perform the actual joins on the destination.

    Now, you may not want to stage the data, but it may be something to consider for the future. The benefits are that you release the source server's resources sooner, minimize time on the network, and make better use of the destination server's resources.
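
    To make the idea concrete, here is a rough sketch of the destination-side join once the base tables have been copied flat into a staging schema (all of the object names here are invented for illustration):

    -- The base tables were extracted unjoined into a Staging schema first;
    -- the join that used to live in the source view now runs on the destination
    INSERT INTO dbo.DestOrders (OrderID, CustomerName, OrderDate)
    SELECT o.OrderID, c.CustomerName, o.OrderDate
    FROM Staging.Orders AS o
    JOIN Staging.Customers AS c
        ON c.CustomerID = o.CustomerID;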

  • Yes, I would agree: the performance gain from taking the flat tables behind the view and doing all the work on the destination server is significant, especially as the row count increases and when there is a complex query attached to the view.

  • Andrew & Hassie, thanks for your comments. Performance isn't a problem for us right now, but it's nice to know what our options might be when the time comes. In our specific case the source view also limits the amount of data to be migrated; if we copied the constituent tables and then set up the view on the destination we'd be copying significantly more rows than the few thousand we're moving now. I'm not sure when it becomes more efficient to move more data than we strictly need, but I imagine it's the sort of thing we could measure and find out when we need to. Thanks again for the idea.

    --MIJ

  • I am in the same situation, but a little bit different: my destination servers are SQL Server 2000. Is there any command to disable indexes while inserting into tables?

  • Please open a new thread - you'll get many more people looking at your question.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs
