T-SQL Loop Question

  • Hello, I have written a loop to transfer data from one table to another. In the code below, what can I do to have it increment to the next 100,000 rows so as not to insert duplicate rows in the target table?

    DECLARE @rows INT

    BEGIN TRY
        WHILE 1 = 1
        BEGIN
            INSERT INTO outlet_sku_xref_test
                ([retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
                 [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
                 [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id])
            SELECT TOP (100000)
                   [retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
                   [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
                   [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id]
            FROM outlet_sku_xref

            IF @@ROWCOUNT = 0 BREAK
        END
    END TRY
    BEGIN CATCH
        SELECT ERROR_NUMBER() ErrorNBR, ERROR_SEVERITY() Severity,
               ERROR_LINE() ErrorLine, ERROR_MESSAGE() Msg
        PRINT 'The Insert Did Not Complete Successfully...'
    END CATCH
    GO

    Best Regards,

    ~David

  • Is there a column that you can use for something like "Where ID between 1000 * (@Iteration-1) + 1 and 1000 * @Iteration"? @Iteration = 1 for the first time through the loop, and increments by 1 each pass through the loop. Thus, it will start with 1-1000, then 1001-2000, and so on.
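
    For example, a rough sketch of that idea against the tables in the original post, assuming a gap-free numeric key column (called [row_id] here, which the source table may not actually have) and a 100,000-row batch:

    DECLARE @Iteration INT, @BatchSize INT, @MaxID INT
    SET @Iteration = 1
    SET @BatchSize = 100000

    -- [row_id] is a hypothetical, gap-free numeric key on the source table
    SELECT @MaxID = MAX([row_id]) FROM outlet_sku_xref

    WHILE (@Iteration - 1) * @BatchSize < @MaxID
    BEGIN
        INSERT INTO outlet_sku_xref_test
            ([retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
             [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
             [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id])
        SELECT [retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
               [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
               [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id]
        FROM outlet_sku_xref
        WHERE [row_id] BETWEEN @BatchSize * (@Iteration - 1) + 1
                           AND @BatchSize * @Iteration

        SET @Iteration = @Iteration + 1
    END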

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I have written a loop to transfer data from one table to another. In the code below, what can I do to have it increment to the next 100,000 rows so as not to insert duplicate rows in the target table?

    The issue with this is that a set has no inherent ordering, so you cannot really speak of the first 100,000 rows and then the next 100,000. What you could do is apply a WHERE clause, for example ... WHERE retailer_id BETWEEN 100001 AND 200000, provided the values are numbered without gaps. Otherwise ROW_NUMBER() might help, but you will have to verify that performance is still acceptable for you, as you would then be selecting the full table content in order to compute a row number for every row.
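
    A rough sketch of the ROW_NUMBER() route, guessing (retailer_id, outlet_id, sku_id) as the ordering columns and using 100,000-row batches; note that the source table gets re-numbered on every pass, which is exactly where the performance concern comes in:

    DECLARE @Batch INT, @BatchSize INT
    SET @Batch = 0
    SET @BatchSize = 100000

    WHILE 1 = 1
    BEGIN
        INSERT INTO outlet_sku_xref_test
            ([retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
             [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
             [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id])
        SELECT [retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
               [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
               [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id]
        FROM (
            -- the ORDER BY columns are a guess; any deterministic ordering will do
            SELECT *, ROW_NUMBER() OVER (ORDER BY [retailer_id], [outlet_id], [sku_id]) AS rn
            FROM outlet_sku_xref
        ) AS numbered
        WHERE numbered.rn >  @Batch * @BatchSize
          AND numbered.rn <= (@Batch + 1) * @BatchSize

        IF @@ROWCOUNT = 0 BREAK
        SET @Batch = @Batch + 1
    END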

    brgds

    Philipp Post


  • There is only a composite unique key.

    Best Regards,

    ~David

  • Can you insert the whole table into a temp table, with an ID on it, and then transfer to the other table with the temp table as the source, in batches at that step?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • You have a unique index, so you could use it.

    INSERT YOURTABLE (blah, blah2, blah3, etc)
    SELECT blah, blah2, blah3, etc
    FROM YOUROTHERTABLE T1
    LEFT OUTER JOIN YOURTABLE T2
           ON  T1.unique_index_part1 = T2.unique_index_part1
           AND T1.unique_index_part2 = T2.unique_index_part2
           AND T1.unique_index_part3 = T2.unique_index_part3
    WHERE T2.some_column_that_does_not_allow_nulls IS NULL
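
    Tying that to the tables and the 100,000-row batches from the original post, it might look something like the sketch below; the three join columns are only a guess at the composite key, so substitute the real key columns:

    WHILE 1 = 1
    BEGIN
        INSERT INTO outlet_sku_xref_test
            ([retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
             [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
             [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id])
        SELECT TOP (100000)
               src.[retailer_id], src.[sku_id], src.[outlet_id], src.[threshold], src.[qty],
               src.[sku_price], src.[tax_rate], src.[sell_thru], src.[inv_status_1], src.[inv_status_2],
               src.[inv_status_3], src.[inv_status_4], src.[inv_status_5], src.[rec_update_date],
               src.[rec_create_date], src.[rec_update_id]
        FROM outlet_sku_xref src
        LEFT OUTER JOIN outlet_sku_xref_test tgt
               ON  src.[retailer_id] = tgt.[retailer_id]   -- guessed key column
               AND src.[outlet_id]   = tgt.[outlet_id]     -- guessed key column
               AND src.[sku_id]      = tgt.[sku_id]        -- guessed key column
        WHERE tgt.[retailer_id] IS NULL                     -- no match = not copied yet

        IF @@ROWCOUNT = 0 BREAK
    END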

    MM

  • Forum Etiquette: How to post Reporting Services problems
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • How to Post Performance Problems - by Gail Shaw

  • Using a left join is one way, but the code will progressively slow down (that doesn't mean it's invalid; sometimes it's still the best-performing solution, because the other options generally involve additional I/O).

    Another option is to create a temporary table that lists the key values and has an identity key. You can track the number of the identity key to start from with a simple counter variable, and join the temporary table to the source table to get the range of keys.

    A third option is to use an order by in your insert, and start the loop by getting the last values for the composite key fields in the destination table. You then have a where clause in the insert statement to skip past those keys. That generally takes the form: (A0 > A1 or (A0 = A1 and B0 > B1)).
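
    A sketch of that third option, pretending for illustration that the composite key is just (retailer_id, sku_id) and that both columns are integers (neither of which is confirmed in this thread):

    DECLARE @LastRetailer INT, @LastSku INT          -- INT is only a guess at the data types

    WHILE 1 = 1
    BEGIN
        -- find the last key currently in the destination
        SELECT TOP (1) @LastRetailer = [retailer_id], @LastSku = [sku_id]
        FROM outlet_sku_xref_test
        ORDER BY [retailer_id] DESC, [sku_id] DESC

        INSERT INTO outlet_sku_xref_test
            ([retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
             [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
             [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id])
        SELECT TOP (100000)
               [retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
               [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
               [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id]
        FROM outlet_sku_xref
        WHERE @LastRetailer IS NULL                  -- destination still empty
           OR [retailer_id] > @LastRetailer
           OR ([retailer_id] = @LastRetailer AND [sku_id] > @LastSku)
        ORDER BY [retailer_id], [sku_id]

        IF @@ROWCOUNT = 0 BREAK
    END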


  • Actually, in SQL 2005, another solution that sometimes works is to use an Output clause to insert into a temp table, then use that to manage the loop. That works best if you have well-ordered data.

    For example, with an ID column, you can Output Into a temp table, select the max value from that and assign it to a variable, truncate the temp table, and insert top 10000 where ID > the variable, then repeat as needed.

    If the key columns involved have an order to them of some sort, then that method can be really, really fast.
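
    The mechanics of that, sketched with made-up tables dbo.src and dbo.dest that both carry a numeric id column (the tables earlier in this thread do not have one, so this is illustration only):

    DECLARE @LastID INT
    SET @LastID = 0

    CREATE TABLE #moved (id INT NOT NULL)

    WHILE 1 = 1
    BEGIN
        INSERT INTO dbo.dest (id, payload)           -- hypothetical tables and columns
        OUTPUT inserted.id INTO #moved (id)
        SELECT TOP (10000) id, payload
        FROM dbo.src
        WHERE id > @LastID
        ORDER BY id

        IF @@ROWCOUNT = 0 BREAK

        SELECT @LastID = MAX(id) FROM #moved         -- how far did we get?
        TRUNCATE TABLE #moved                        -- reset for the next pass
    END

    The ORDER BY on the id keeps each batch contiguous, which is why well-ordered data makes this approach fast.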

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • follow up to GSquared's first idea:

    You could add an Identity field to a temp table loaded with your source data. Then load your destination table from that.

    -- create temp source table
    SELECT * INTO #tmpSku FROM outlet_sku_xref

    -- add unique ID field
    ALTER TABLE #tmpSku ADD sku_row_id INTEGER IDENTITY (1,1)

    INSERT INTO outlet_sku_xref_test
    SELECT
    ....
    FROM #tmpSku
    WHERE sku_row_id BETWEEN 1 AND 10000

    INSERT INTO outlet_sku_xref_test
    SELECT
    ....
    FROM #tmpSku
    WHERE sku_row_id BETWEEN 10001 AND 20000

    ...and so on
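
    And rather than spelling each range out by hand, the same thing can be driven by a loop (the batch size here is arbitrary):

    DECLARE @From INT, @BatchSize INT, @MaxRow INT
    SET @From = 1
    SET @BatchSize = 10000
    SELECT @MaxRow = MAX(sku_row_id) FROM #tmpSku

    WHILE @From <= @MaxRow
    BEGIN
        INSERT INTO outlet_sku_xref_test
            ([retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
             [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
             [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id])
        SELECT [retailer_id], [sku_id], [outlet_id], [threshold], [qty], [sku_price],
               [tax_rate], [sell_thru], [inv_status_1], [inv_status_2], [inv_status_3],
               [inv_status_4], [inv_status_5], [rec_update_date], [rec_create_date], [rec_update_id]
        FROM #tmpSku
        WHERE sku_row_id BETWEEN @From AND @From + @BatchSize - 1

        SET @From = @From + @BatchSize
    END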

  • I must be missing something...would it not work if you just removed the loop and inserted the records?

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • slange (4/27/2009)


    I must be missing something...would it not work if you just removed the loop and inserted the records?

    There are times when it's more efficient to do batches like this, usually when you're transferring huge amounts of data.

    One that I ran into recently was a table with over 100-million rows of data, and I needed to archive about 75% of it into another table. Partitioning was not an option. Just trying to run it all as a single insert followed by a delete would literally have locked up the server for days (low end server). Breaking it up into small batches, however, worked beautifully. Still took a long time, but the regular processes on the server could continue while it chugged away at the archive process.

    Tried various row-counts on the insert, found that it got the most done per hour at about 7k rows per batch, with the least interference with the regular jobs on there.

    Used Insert...Output Into...Select to archive, then used the contents of the temp table I was outputting into for a delete command, all wrapped in a transaction. Looped that with While @@Trancount > 0, and it just ran and ran and got the job done efficiently.

    SQL can handle those huge operations, but many servers aren't up to it.
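
    The general shape of that kind of archive loop, sketched here with made-up table and column names (dbo.BigTable, dbo.BigTableArchive, an id key and an archivable_flag) rather than Gus's actual code:

    CREATE TABLE #archived (id INT NOT NULL)

    WHILE 1 = 1
    BEGIN
        BEGIN TRANSACTION

        INSERT INTO dbo.BigTableArchive (id, col1, col2)   -- hypothetical tables and columns
        OUTPUT inserted.id INTO #archived (id)
        SELECT TOP (7000) id, col1, col2
        FROM dbo.BigTable
        WHERE archivable_flag = 1

        DELETE b
        FROM dbo.BigTable b
        JOIN #archived a ON a.id = b.id                    -- delete only what was archived

        COMMIT TRANSACTION

        IF NOT EXISTS (SELECT 1 FROM #archived) BREAK      -- nothing left to move

        TRUNCATE TABLE #archived
    END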

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (4/28/2009)


    slange (4/27/2009)


    I must be missing something...would it not work if you just removed the loop and inserted the records?

    There are times when it's more efficient to do batches like this, usually when you're transferring huge amounts of data.

    As a follow-up: when you do one big insert, SQL Server does it all as a single transaction, which means you can blow out the transaction logs, and potentially tempdb as well as it rebuilds indexes. While SQL Server is rebuilding indexes and the like, all sorts of locking and resource contention are caused, and for a large update that can have a significant impact on the performance of anything else running.

    If you break it into smaller "chunks" or "batches", then you don't run the risk of SQL Server running out of capacity or turning your transaction logs into mega-gigabyte behemoths.

    Another valid approach for this sort of operation is to pull the data out into a file and use SSIS, BCP, or BULK INSERT to load it, avoiding most of the logging. You still get the issue with index rebuilds, but there are options to commit after every set number of rows.
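
    For example (the server, database and file names below are made up), the table could be exported with bcp in native format and reloaded with BULK INSERT, committing one batch at a time:

    -- at a command prompt: native-format export
    -- bcp MyDB.dbo.outlet_sku_xref out C:\temp\outlet_sku_xref.dat -n -T -S MyServer

    -- then in T-SQL: load in 100,000-row batches so each batch commits on its own
    BULK INSERT outlet_sku_xref_test
    FROM 'C:\temp\outlet_sku_xref.dat'
    WITH (DATAFILETYPE = 'native', BATCHSIZE = 100000, TABLOCK)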


  • Yeah, I've used the export...import method too. That works well in some situations. I've found it quite advantageous when the data has to move across a slow network. FTPing a text file can be a lot faster than any sort of linked server type solution.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
