Fast Insert

Question

Post reply

Fast Insert

Gregg Dave

Say Hey Kid

Points: 680
More actions
May 14, 2015 at 5:05 am

#398460

which way is faster .

Inserting data into a table having clustered Index or Inserting data into a table then create clustered index.

Insertion happens from a different table and its about 1million rows

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

SQL.AP Hall of Fame Points: 3244 More actions · Answer 1

SQL.AP

Hall of Fame

Points: 3244

May 14, 2015 at 5:09 am

#1797789

If you are sure your source data is unique then First Load data into your table and then create clustered index on it. It would be faster.

____________________________________________________________

AP

ScottPletcher SSC Guru Points: 100262 More actions · Answer 2

It's faster to create the clustered index first, because if necessary you can always explicitly sort the data being into cluster key order for the insert. Typically SQL will sort it that way automatically, but check the query plan, and if it doesn't, specify it yourself.

For example, say new_table is clustered on ( cola, colf ):

INSERT INTO new_table

SELECT ...

FROM ...

ORDER BY cola, colf

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 3

As always, "It Depends". For example, no one has mentioned using the high performance "SELECT/INTO" method, which works wonders even if minimal logging isn't possible due to the current Recovery Model. The ONLY way to know is to try it because "your mileage may vary" depending on the nature of the table and a bunch of other variables.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

sickb00000y Old Hand Points: 313 More actions · Answer 4

Here's a script and its execution plan containing the estimated stats for testing two different ways of inserting data into a table (INSERT vs SELECT INTO). I've used SQL Sentry Plan Explorer to get actual stats, and inserting into a table with a clustered index is faster (in this test) by ~200ms for 100,000 rows than inserting into a table and adding the index after. As Jeff Moden says above, "It Depends", but at least you've got something here to look at.

Eirikur Eiriksson SSC Guru Points: 182631 More actions · Answer 5

Gregg Dave (5/14/2015)

which way is faster .

Inserting data into a table having clustered Index or Inserting data into a table then create clustered index.

Insertion happens from a different table and its about 1million rows

As Jeff already said, it depends, suggest you have a look at this article, Minimally Logged Inserts

😎

Gregg Dave Say Hey Kid Points: 680 More actions · Answer 6

sickb00000y (5/14/2015)

Here's a script and its execution plan containing the estimated stats for testing two different ways of inserting data into a table (INSERT vs SELECT INTO). I've used SQL Sentry Plan Explorer to get actual stats, and inserting into a table with a clustered index is faster (in this test) by ~200ms for 100,000 rows than inserting into a table and adding the index after. As Jeff Moden says above, "It Depends", but at least you've got something here to look at.

Thanks sickb00000y,

The result was better by 8 seconds when i tried with 5million rows with table having Index.

ScottPletcher SSC Guru Points: 100262 More actions · Answer 7

ScottPletcher

SSC Guru

Points: 100262

May 15, 2015 at 8:31 am

#1798210

That's not surprising once you think through the process. Loading the table and then clustering it requires all the rows to be inserted twice.

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Eric M Russell SSC Guru Points: 125255 More actions · Answer 8

Here are a handful of approaches, listed from what's generally fastest (select / into) to slowest.

Select into a new table (ie: SELECT * INTO B FROM A).

Select into an empty table.

Select into a table with no indexes, foreign keys, or triggers.

Select into a table with indexs, et all, but order by clustering key.

Select into a table with indexs,et all, and don't specify order by.

Basically what you're trying to do is minimize transaction logging, avoid on-the-fly foreign key lookups and index writes, and avoiding page splitting (inserting non-sorted records into a clustered index).

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

ScottPletcher SSC Guru Points: 100262 More actions · Answer 9

That's a good point about the minimal logging.

But from at least SQL 2008 on, you can get minimal logging on the insert into the existing empty clustered index as well, by doing the following:

1) Insuring the insert is in clustered key order, as I noted above.

AND (either

A) Enable trace flag 610 (this has other considerations and should be vetted before being done).

OR

B) Specify the TABLOCK hint in the load.

INSERT INTO clus_table ( TABLOCK )

SELECT ...

FROM ...

ORDER BY clus_key1, clus_key2

)

Edit: Sorry for not mentioning that earlier. I always sort the rows and specify TABLOCK for a new table load by habit. On some servers I have TF610, but not on others, but it still doesn't hurt to specify TABLOCK on table initial load.

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!