Maintaining Sequence of OUTPUT rows during INSERT

  • We are writing a stored procedure that will be doing an insert to one table followed by an update to another table. (Edited for clarification.)

    The update will set the value of a target column to the newly created identity value(s) from the insert.

    The key to the table to be updated is a GUID which does not exist in the table to be inserted into.

    (Please do not lecture me about GUIDs being keys at this point in time. I understand and agree.)

    Therefore, we have to be able to join the results of the OUTPUT back to the original table-valued parameter to get the GUID so we can subsequently do the update.

    I have created the following solution using "RowIDs" to preserve the sequence of rows in the table. It is working in early testing but I can't find documentation that the OUTPUT rows will always be in the same sequence as the ORDER BY clause of the insert. This is critical or else the RowIDs of the output will not match the RowIDs of the original table-valued parameter.

    Can anyone either break this, or point me to documentation which establishes this is reliable? Or, alternatively, suggest another technique?

    Many thanks in advance.

    - Bob

    CREATE TABLE #COLORS (ColorID int identity(1,1) primary key, Color varchar(20))

    insert into #colors values ('Black'),('White')

    DECLARE @ColorsToAdd Table (RowID int identity(1,1) primary key, RandomGUID uniqueIdentifier, ColorToAdd varchar(20))

    DECLARE @AddedColors Table (RowID int identity(1,1) primary key, AddedColorID int, AddedColor varchar(20))

    insert into @ColorsToAdd(RandomGUID, ColorToAdd)

    select NewID(),'Pink' union all

    select NewID(),'Aqua' union all

    select NewID(),'Teal' union all

    select NewID(),'Gold'

    select '@ColorsToAdd' as [Table],* from @ColorsToAdd

    insert into #colors (color)

    output inserted.ColorID, inserted.color

    into @AddedColors(AddedColorID, AddedColor)

    select ColorToAdd

    from @colorsToAdd

    order by RowID -- DOES THIS CONTRL THE SEQUENCE OF THE OUTPUT CLAUSE ???

    select '@AddedColors' as [Table],* from @AddedColors

    -- confirm id/color didn't change

    if exists ( select 1

    from @ColorsToAdd ca

    join @AddedColors ac on ac.RowID = ca.rowid

    where AddedColor <> ColorToAdd )

    begin

    select 'Sequence has been lost' as [Error]

    end

    else

    begin

    select top (100) ca.*, ac.AddedColorID, ac.AddedColor

    from @ColorsToAdd ca

    join @AddedColors ac on ca.RowID = ac.RowID

    end

    -- From here, another table will be updated by matching on RandomGuid

    -- and updating the target column with the value of AddedColorID

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • This is the ol' batch Invoice/Invoice detail problem. I've tried many different renditions, including the very one that you've just tried and I'm here to tell you that it will eventually fail, like it did for me.

    The only way to do this in a guaranteed fashion is to add the GUIDs as a part of the insert to the IDENTITY table. Of course, that also means adding a column to hold the GUID but, if you need to do such things in a batch fashion other than RBAR, then adding the GUID column is absolutely the only safe way to do it.

    If you can't add the column, then you're going to have to resort to RBAR.

    I'm not the only one that thinks this. Conor Cunningham (the T_SQL "god" from Microsoft) has made himself very clear on this subject more than once on MVP email... it is NOT guaranteed that the order from OUTPUT means anything.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Found a compromise that looks to be bulletproof. The target table for the inserts is going to get huge, so we didn't want to store GUIDs in it for a one-time use, but we can probably live with adding two bytes for a smallint. (The insert batches will never be over 20,000 rows at a time, so a smallint will serve.)

    Once that column is added, we just insert the original RowID, so it can now be included in the OUTPUT clause and used to join back to the original insert data.

    Another potential solution would be to use a Sequence object instead of identity column in the target insert table. We rejected that because a lot of developers are using Entity Framework and our ability to enforce their use of stored procedures (or the proper sequence object) was questionable.

    Thanks for the timely feedback, Jeff.

    CREATE TABLE #COLORS (ColorID int IDENTITY(1,1) PRIMARY key, Color varchar(20), NefariousID SMALLINT)

    -- entity framework inserts

    INSERT INTO #COLORS( Color )

    SELECT 'Black' UNION ALL

    SELECT 'White'

    DECLARE @ColorsToAdd Table (RowID smallint identity(1,1) primary key, RandomGUID uniqueIdentifier, ColorToAdd varchar(20))

    DECLARE @AddedColors Table (AddedColorID int, AddedColor varchar(20), NefariousID SMALLINT PRIMARY KEY)

    insert into @ColorsToAdd(RandomGUID, ColorToAdd)

    select NewID(),'Pink' union all

    select NewID(),'Aqua' union all

    select NewID(),'Teal' union all

    select NewID(),'Gold'

    select '#Colors' as [Table],*

    FROM #COLORS

    select '@ColorsToAdd' as [Table],*

    FROM @ColorsToAdd

    insert into #colors (Color, NefariousID)

    output inserted.ColorID, inserted.Color, inserted.NefariousID

    into @AddedColors(AddedColorID, AddedColor, NefariousID)

    select ColorToAdd, RowID

    from @colorsToAdd

    order by RowID -- DOES THIS CONTRL THE SEQUENCE OF THE OUTPUT CLAUSE ???

    select '@AddedColors' as [Table],* from @AddedColors

    -- confirm id/color didn't change

    if exists ( select 1

    from @ColorsToAdd ca

    join @AddedColors ac on ac.NefariousID = ca.rowid

    where AddedColor <> ColorToAdd )

    begin

    select 'Sequence has been lost' as [Error]

    end

    else

    begin

    select top (100) ca.*, ac.AddedColorID, ac.AddedColor

    from @ColorsToAdd ca

    join @AddedColors ac on ca.RowID = ac.NefariousID

    end

    -- From here, another table will be updated by matching on RandomGuid

    -- and updating the target column with the value of AddedColorID

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • 1) you asked to not be lectured on GUIDs, so I won't do that.

    2) I WILL lecture you on the design choice of doing a partial insert and then updating the same rows after the fact. Horribly bad choice because it immediately makes every row larger than it was (and also has a NULL column with it's own overhead), leading to page splits galore. I tell devs to jump through hoops to avoid doing this.

    3) Can't this be done by simply selecting the just-added rows from the colors table? Note that depending on your concurrency needs you may need to be very rigorous with your isolation level/locking hints to avoid collisions.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I am insulted that you would dare accuse me of such a travesty!! :w00t:

    Just kidding.

    We aren't updating the same table after an insert. We are inserting into table A, and then updating table B.

    Sorry if my explanation of the problem wasn't clear enough. I went back and changed the original post to try to clarify it.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (7/7/2016)


    I am insulted that you would dare accuse me of such a travesty!! :w00t:

    Just kidding.

    We aren't updating the same table after an insert. We are inserting into table A, and then updating table B.

    Sorry if my explanation of the problem wasn't clear enough. I went back and changed the original post to try to clarify it.

    HAH - good one!

    So then the question becomes can you hold off creating the B-table rows until you have acquired all that you need to populate them without doing a post-insert update?

    And what about getting the data you need from the inserted-to table after the insert? Assuming you can't create duplicates then this would be an easy (efficient index seek??) thing to do. If you can create dupes you need some form of "get the latest of each value" predicate added into the mix.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • If you change your INSERT to a MERGE, then you can just OUTPUT the original GUID. Of course, the MERGE introduces it's own problems.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (7/7/2016)


    If you change your INSERT to a MERGE, then you can just OUTPUT the original GUID. Of course, the MERGE introduces it's own problems.

    Drew

    I'm not following you, Drew. How is the output clause different with a MERGE?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • TheSQLGuru (7/7/2016)


    So then the question becomes can you hold off creating the B-table rows until you have acquired all that you need to populate them without doing a post-insert update?

    And what about getting the data you need from the inserted-to table after the insert? Assuming you can't create duplicates then this would be an easy (efficient index seek??) thing to do. If you can create dupes you need some form of "get the latest of each value" predicate added into the mix.

    Let me try this again:

    The procedure is being passed a table-valued parameter. In my example above, this corresponds to @ColorsToAdd.

    The procedure will insert some data from this tvparm into Table A (#Colors in my example).

    -- The inserts can create duplicate values for all columns except the identity/primary key column.

    Subsequently, a column in Table B needs to be updated with the new identity value(s) from Table A (#colors.[ColorID]).

    The join to Table B must based on the GUID value from @ColorsToAdd. We do NOT want to store that GUID in #Colors.

    The problem is to get the new identity values from the #Colors inserts associated back with the GUIDs from @ColorsToAdd.

    Once that is accomplished we can proceed to the update query.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • I've updated your most recent test script. The MERGE statement allows more flexibility in the OUTPUT clause than other DML statements. This allows the RowID to be passed from one table variable to another, while being associated with the identity value, and not having to be stored in a permanent table.

    CREATE TABLE #COLORS (ColorID int IDENTITY(1,1) PRIMARY key, Color varchar(20))

    -- entity framework inserts

    INSERT INTO #COLORS( Color )

    SELECT 'Black' UNION ALL

    SELECT 'White'

    DECLARE @ColorsToAdd Table (RowID smallint identity(1,1) primary key, RandomGUID uniqueIdentifier, ColorToAdd varchar(20))

    DECLARE @AddedColors Table (AddedColorID int, AddedColor varchar(20), NefariousID SMALLINT PRIMARY KEY)

    insert into @ColorsToAdd(RandomGUID, ColorToAdd)

    select NewID(),'Pink' union all

    select NewID(),'Aqua' union all

    select NewID(),'Teal' union all

    select NewID(),'Gold'

    select '#Colors' as [Table],*

    FROM #COLORS

    select '@ColorsToAdd' as [Table],*

    FROM @ColorsToAdd

    MERGE #colors as tgt

    USING (select ColorToAdd, RowID from @colorsToAdd) AS src

    ON 1 = 0

    WHEN NOT MATCHED THEN

    INSERT (Color) VALUES (src.ColorToAdd)

    output inserted.ColorID, inserted.Color, src.RowID

    into @AddedColors(AddedColorID, AddedColor, NefariousID);

    select '@AddedColors' as [Table],* from @AddedColors

    -- confirm id/color didn't change

    if exists ( select 1

    from @ColorsToAdd ca

    join @AddedColors ac on ac.NefariousID = ca.rowid

    where AddedColor <> ColorToAdd )

    begin

    select 'Sequence has been lost' as [Error]

    end

    else

    begin

    select top (100) ca.*, ac.AddedColorID, ac.AddedColor

    from @ColorsToAdd ca

    join @AddedColors ac on ca.RowID = ac.NefariousID

    end

    -- From here, another table will be updated by matching on RandomGuid

    -- and updating the target column with the value of AddedColorID

  • The Dixie Flatline (7/7/2016)


    drew.allen (7/7/2016)


    If you change your INSERT to a MERGE, then you can just OUTPUT the original GUID. Of course, the MERGE introduces it's own problems.

    Drew

    I'm not following you, Drew. How is the output clause different with a MERGE?

    With an INSERT, you can only reference the pseudo-tables INSERTED and DELETED, but with MERGE you can also access any of the other tables involved in the query.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Drew, that sounds like the answer to a maiden's prayer, and I'd like to remember it for future reference, but after reading your note I looked at the MSDN pages for both OUTPUT and MERGE and didn't find an example.

    Could you post an example or point me to a URL online?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Found one: http://sqlblog.com/blogs/rob_farley/archive/2012/06/12/merge-gives-better-output-options.aspx

    Stefanie, not sure why I missed reading your message with the sample code I wanted. But thanks to you as well.

    Thanks to you all.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (7/7/2016)


    Drew, that sounds like the answer to a maiden's prayer, and I'd like to remember it for future reference, but after reading your note I looked at the MSDN pages for both OUTPUT and MERGE and didn't find an example.

    Could you post an example or point me to a URL online?

    Yes, the MSDN documentation on it isn't very clear, but you can find it on the OUTPUT Clause page in the section on Arguments.

    INSERTED


    Is a column prefix that specifies the value added by the insert or update operation. Columns prefixed with INSERTED reflect the value after the UPDATE, INSERT, or MERGE statement is completed but before triggers are executed.

    from_table_name


    Is a column prefix that specifies a table included in the FROM clause of a DELETE, UPDATE, or MERGE statement that is used to specify the rows to update or delete.

    Also the examples include using from_table_name in an UPDATE (E) and DELETE (F). Since MERGE is a combination of INSERT/UPDATE/DELETE, it has characteristics of all of them, which is probably why a MERGE-specific example wasn't included.

    Drew

    Edited to include information on examples.

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Yes, the MSDN documentation on it isn't very clear, but you can find it on the OUTPUT Clause page in the section on Arguments.

    I see it in the syntax section.

    from_table_name


    Is a column prefix that specifies a table included in the FROM clause of a DELETE, UPDATE, or MERGE statement that is used to specify the rows to update or delete.

    But not an INSERT (sigh).

    Also the examples include using from_table_name in an UPDATE (E) and DELETE (F). Since MERGE is a combination of INSERT/UPDATE/DELETE, it has characteristics of all of them, which is probably why a MERGE-specific example wasn't included.

    Makes sense. Using a fromtable qualifier didn't work when we originally wrote the INSERT. I'm curious what is so different about that, but we have our answer now.

    Thanks again, Drew.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply