Is there a way to do this without a cursor/loop to process individual records?

  • I am working with a database from a 3rd party. We need to be able to insert into one of their tables to process records but have a requirement to get a sequence number for each record using their process. To handle sequence numbers, they have a table created that only has 1 column in it that is setup as an IDENTITY column.

    I want to be able to insert any number of records into a table for processing but need to append each record with a sequence number as noted above. I have written this as a cursor below so you can see what it is I need to do.

    Is there a better way to do this though rather than record by record? The code below will create a simple replica of this system. Currently, this process is on SQL 2005 so I am posting it here but it will be moving soon to SQL 2008 R2. I mention this in case SQL 2008 R2 offers something that SQL 2005 currently does not.

    -- Drop tables/procs if they exist

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[SEQUENCE]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)

    DROP TABLE [dbo].[SEQUENCE]

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[SOURCE]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)

    DROP TABLE [dbo].[SOURCE]

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[DESTINATION]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)

    DROP TABLE [dbo].[DESTINATION]

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[new_sequence]') AND type in (N'P', N'PC'))

    DROP PROCEDURE [dbo].[new_sequence]

    GO

    -- 3rd party table I am required to use to get sequence values

    CREATE TABLE [dbo].[SEQUENCE](

    [retident] [int] IDENTITY(1,1) NOT NULL,

    CONSTRAINT [PK_SEQUENCE] PRIMARY KEY CLUSTERED

    (

    [retident] ASC

    )

    ) ON [PRIMARY]

    GO

    -- 3rd party PROC. I am not required to use it. I just need to do the same functionality with SEQUENCE table

    CREATE PROCEDURE [dbo].[new_sequence] @retident INT OUTPUT

    AS

    INSERT INTO SEQUENCE

    DEFAULT VALUES

    SET @retident = SCOPE_IDENTITY()

    GO

    -- For simplicity, my SOURCE table will only have 1 column of data to insert into DESTINATION table

    CREATE TABLE SOURCE (

    Data1 VARCHAR(5) NOT NULL)

    GO

    INSERT INTO SOURCE

    SELECT 'A'

    UNION ALL

    SELECT 'B'

    UNION ALL

    SELECT 'C'

    SELECT * FROM SOURCE

    -- The Destination Table will combine the data from SOURCE with the sequence value provided from SEQUENCE

    CREATE TABLE DESTINATION (

    Data1 VARCHAR(5) NOT NULL

    ,seqvalue INT NOT NULL)

    GO

    -- Cursor Method

    -- I am able to accomplish this with a cursor but I would prefer not to.

    -- I can be inserting several hundred records at a time and don't want to process them one by one.

    -- But I don't know how to work with the SEQUENCE table to get the values out of it any other way.

    DECLARE

    @Data1 VARCHAR(5)

    ,@Seqvalue INT

    DECLARE INSERTDATA_CURSOR CURSOR

    FOR

    SELECT

    Data1

    FROM

    dbo.SOURCE

    OPEN INSERTDATA_CURSOR

    FETCH INSERTDATA_CURSOR INTO @Data1

    WHILE (@@FETCH_STATUS = 0)

    BEGIN

    EXEC new_sequence @Seqvalue OUTPUT

    INSERT INTO DESTINATION

    VALUES (@Data1, @Seqvalue)

    FETCH NEXT FROM INSERTDATA_CURSOR INTO @Data1

    END

    CLOSE INSERTDATA_CURSOR

    DEALLOCATE INSERTDATA_CURSOR

    SELECT * FROM DESTINATION

  • How's this?

    DECLARE @Output TABLE (Data1 varchar(5), seqvalue int);

    BEGIN TRANSACTION;

    BEGIN TRY;

    WITH CTE AS

    (

    SELECT Data1,

    RN = ROW_NUMBER() OVER (ORDER BY Data1)

    FROM Source

    )

    -- use TABLOCKX to lock table, prevent other users from accessing it

    -- perform cross-join against the max value of the Sequence tble

    INSERT INTO Destination (Data1, seqvalue)

    OUTPUT inserted.Data1, inserted.seqvalue INTO @Output

    SELECT CTE.Data1, CTE.RN + s.retident

    FROM CTE

    CROSS JOIN (SELECT retident = max(retident)

    FROM Sequence WITH (TABLOCKX)) s;

    -- immediately add the new seqvalue values into the Sequence table.

    SET IDENTITY_INSERT Sequence ON;

    INSERT INTO Sequence WITH (TABLOCKX) (retident)

    SELECT seqvalue FROM @Output ORDER BY seqvalue;

    SET IDENTITY_INSERT Sequence OFF;

    -- if reach this point, there was no PK violation on the Sequence table

    -- which could potentially happen in a multi-user situation.

    IF @@TRANCOUNT > 0 COMMIT TRANSACTION;

    END TRY

    BEGIN CATCH;

    -- roll back all activity

    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;

    END CATCH

    SELECT * from Sequence;

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Did you talk to the vendor regarding your attemt? They might not provide support as soon as you start bypassing their software...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Lutz,

    We are actually required to post data into some of their tables for processing. The problem I am now trying to resolve is that the person who originally wrote this used their own arbitrary sequence generation for the records that are being inserted. We have determined that if we continue doing this, we will eventually run into problems. Because of this, we still need to insert data into the tables but we need to start using the sequence number that the 3rd party application expects us to.

    Wayne,

    Thank you for your quick response. Your approach seems reasonable but I worry about the TableLock that has to be placed on the SEQUENCE table to do this. This application is medium to high transaction and they rely on getting values from this table for just about every transaction that takes place. I am not sure I can afford to lock the table for any prolong period of time. With the little simplified test, it is all sub-second but results could be different when working with the real data. I also do not know how the application processes would handle it if they could not get a sequence number when they need it.

    I will have to think about this further.

    The vendors code is very much row based. From what I have seen in running Profiler against it, they use Cursors for almost everything. Because of this, there approach would be to have us do the same so that we can get the sequence numbers one at a time just like they do. I am hoping for a solution that that would allow me to do set-based without affecting the current application processes. Any thoughts on how to do this without table locks?

  • Understood. Just wanted to make sure.

    Regarding your performance concern:

    As long as you're not running this code with hundred thousand of rows from your source table during peak hours, you should be fine.

    I'd use this code with batches of about 1000 to 5000 rows during non-peak hours (<75% max. peak). That should have minimal impact.

    If the vendor software is c.u.r.s.o.r. *cough* based anyway, their software most probably won't even notice.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • The reason for those table locks is to avoid assigning the same number to two different processes at once. Since what I wrote is set-based, it's working outside of their row-by-row process. To do anything in a set-based manner, I don't see a way to make it work without locking the table while numbers are being assigned for this set-based update.

    As Lutz mentioned, when running this against a small set of data, you should have sub-second response anyway, and it's doubtful that their RBAR process would even notice. However, this is something that would need to be tested.

    How many rows are you typically inserting?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • The most I have seen at one time is a couple of hundred records so we should be OK. I will begin testing with it and getting approval from then vendor so that they are aware.

    Thanks for all of your help with this.

    Chad

  • Chad Carter (1/6/2011)


    The most I have seen at one time is a couple of hundred records so we should be OK. I will begin testing with it and getting approval from then vendor so that they are aware.

    Thanks for all of your help with this.

    Chad

    Unless you're doing it with a Pentium I processor and a 200rpm drive you should be fine 😀



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Chad Carter (1/6/2011)


    The most I have seen at one time is a couple of hundred records so we should be OK. I will begin testing with it and getting approval from then vendor so that they are aware.

    Thanks for all of your help with this.

    Chad

    A couple of hundred records should only take a few milliseconds - well within the wait limits before things will time out and cause problems.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Chad,

    There's a way to do this without table locking and such if you can add a dummy column the the Sequence table. (Unless there is some way to use DEFAULT VALUES in a SELECT statement that I don't know about - please somebody correct me if there is a way to do it.)

    If the 3rd party vendor will let you just change the Sequence table to look like this:

    CREATE TABLE [dbo].[SEQUENCE](

    [retident] [int] IDENTITY(1,1) NOT NULL,

    DummyChar CHAR(1),

    CONSTRAINT [PK_SEQUENCE] PRIMARY KEY CLUSTERED

    (

    [retident] ASC

    )

    ) ON [PRIMARY]

    You want to create a snapshot of what you are about to insert. This could be the entire data or just key values that you can use to join back to the Source. The idea of the snapshot is so you never end up with someone concurrently inserting into the source after you have generated sequence numbers so that the number of rows inserted will always be correct.

    Create a temp table of the insert:

    SELECT

    RecID = IDENTITY (INT, 1, 1)

    , Data1

    INTO #S

    FROM [SOURCE]

    Create another temp table to capture the Identities inserted into

    the main Sequence table. You can use a # temp table or table variable - I'm using a table variable here.

    DECLARE @Tbl TABLE

    ( SeqValue INT

    , RecID INT IDENTITY(1, 1) -- This lines up with the temp table

    )

    Generate the sequence numbers from the main Sequence table by inserting into the DummyChar column and capture the Identity.

    INSERT INTO SEQUENCE

    ( DummyChar )

    OUTPUT inserted.retident INTO @Tbl

    SELECT 'X'

    FROM #S

    ORDER BY RecID

    You now have #S with what you are going to insert and @Tbl with the generated Identities and they relate by RecID.

    INSERT INTO DESTINATION

    ( Data1, seqvalue )

    SELECT S.Data1, T.SeqValue

    FROM #S S

    INNER JOIN @Tbl T ON

    S.RecID = T.RecID

    This should avoid any table locking and concurrency issues.

    Todd Fifield

  • That is good to know as well. That was a big factor limiting me on what I could do trying to work with a table with nothing other than the identity column. I don't see me being able to get a dummy column added to the table but I can always ask.

    Thanks!

    Chad

  • Chad,

    I would try to politely embarrass the 3rd party into allowing a dummy column in the sequence table. After all, a table with a single Identity column is pretty dumb (I did say politely).

    About the only reason I could think of to not allow another column would be equally dumb. That would be something like:

    CREATE VIEW DumbView

    AS

    SELECT SEQUENCE.*, DATA.*

    FROM SEQUENCE

    INNER JOIN DATA ON

    SEQUENCE.SEQUENCE_NO = DATA.SEQUENCE_NO

    Todd Fifield

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply