Is there a way to do this without a cursor/loop to process individual records?

Question

Post reply

Is there a way to do this without a cursor/loop to process individual records?

Chad Carter

Ten Centuries

Points: 1089
More actions
January 4, 2011 at 6:39 pm

#232971

I am working with a database from a 3rd party. We need to be able to insert into one of their tables to process records but have a requirement to get a sequence number for each record using their process. To handle sequence numbers, they have a table created that only has 1 column in it that is setup as an IDENTITY column.

I want to be able to insert any number of records into a table for processing but need to append each record with a sequence number as noted above. I have written this as a cursor below so you can see what it is I need to do.

Is there a better way to do this though rather than record by record? The code below will create a simple replica of this system. Currently, this process is on SQL 2005 so I am posting it here but it will be moving soon to SQL 2008 R2. I mention this in case SQL 2008 R2 offers something that SQL 2005 currently does not.

-- Drop tables/procs if they exist IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[SEQUENCE]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1) DROP TABLE [dbo].[SEQUENCE] IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[SOURCE]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1) DROP TABLE [dbo].[SOURCE] IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[DESTINATION]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1) DROP TABLE [dbo].[DESTINATION] IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[new_sequence]') AND type in (N'P', N'PC')) DROP PROCEDURE [dbo].[new_sequence] GO -- 3rd party table I am required to use to get sequence values CREATE TABLE [dbo].[SEQUENCE]( [retident] [int] IDENTITY(1,1) NOT NULL, CONSTRAINT [PK_SEQUENCE] PRIMARY KEY CLUSTERED ( [retident] ASC ) ) ON [PRIMARY] GO -- 3rd party PROC. I am not required to use it. I just need to do the same functionality with SEQUENCE table CREATE PROCEDURE [dbo].[new_sequence] @retident INT OUTPUT AS INSERT INTO SEQUENCE DEFAULT VALUES SET @retident = SCOPE_IDENTITY() GO -- For simplicity, my SOURCE table will only have 1 column of data to insert into DESTINATION table CREATE TABLE SOURCE ( Data1 VARCHAR(5) NOT NULL) GO INSERT INTO SOURCE SELECT 'A' UNION ALL SELECT 'B' UNION ALL SELECT 'C' SELECT * FROM SOURCE -- The Destination Table will combine the data from SOURCE with the sequence value provided from SEQUENCE CREATE TABLE DESTINATION ( Data1 VARCHAR(5) NOT NULL ,seqvalue INT NOT NULL) GO -- Cursor Method -- I am able to accomplish this with a cursor but I would prefer not to. -- I can be inserting several hundred records at a time and don't want to process them one by one. -- But I don't know how to work with the SEQUENCE table to get the values out of it any other way. DECLARE @Data1 VARCHAR(5) ,@Seqvalue INT DECLARE INSERTDATA_CURSOR CURSOR FOR SELECT Data1 FROM dbo.SOURCE OPEN INSERTDATA_CURSOR FETCH INSERTDATA_CURSOR INTO @Data1 WHILE (@@FETCH_STATUS = 0) BEGIN EXEC new_sequence @Seqvalue OUTPUT INSERT INTO DESTINATION VALUES (@Data1, @Seqvalue) FETCH NEXT FROM INSERTDATA_CURSOR INTO @Data1 END CLOSE INSERTDATA_CURSOR DEALLOCATE INSERTDATA_CURSOR SELECT * FROM DESTINATION

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

WayneS SSC Guru Points: 95405 More actions · Answer 1

How's this?

DECLARE @Output TABLE (Data1 varchar(5), seqvalue int);



BEGIN TRANSACTION;

BEGIN TRY;

  WITH CTE AS

  (

  SELECT Data1,

         RN = ROW_NUMBER() OVER (ORDER BY Data1)

    FROM Source

  )

  -- use TABLOCKX to lock table, prevent other users from accessing it

  -- perform cross-join against the max value of the Sequence tble

  INSERT INTO Destination (Data1, seqvalue)

  OUTPUT inserted.Data1, inserted.seqvalue INTO @Output

  SELECT CTE.Data1, CTE.RN + s.retident

    FROM CTE

         CROSS JOIN (SELECT retident = max(retident)

                       FROM Sequence WITH (TABLOCKX)) s;

  -- immediately add the new seqvalue values into the Sequence table.

  SET IDENTITY_INSERT Sequence ON;

  INSERT INTO Sequence WITH (TABLOCKX) (retident)

  SELECT seqvalue FROM @Output ORDER BY seqvalue;

  SET IDENTITY_INSERT Sequence OFF;

  -- if reach this point, there was no PK violation on the Sequence table

  -- which could potentially happen in a multi-user situation.

  IF @@TRANCOUNT > 0 COMMIT TRANSACTION;

END TRY

BEGIN CATCH;

  -- roll back all activity

  IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;

END CATCH

SELECT * from Sequence;

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

LutzM SSC Guru Points: 107049 More actions · Answer 2

Did you talk to the vendor regarding your attemt? They might not provide support as soon as you start bypassing their software...

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Chad Carter Ten Centuries Points: 1089 More actions · Answer 3

Lutz,

We are actually required to post data into some of their tables for processing. The problem I am now trying to resolve is that the person who originally wrote this used their own arbitrary sequence generation for the records that are being inserted. We have determined that if we continue doing this, we will eventually run into problems. Because of this, we still need to insert data into the tables but we need to start using the sequence number that the 3rd party application expects us to.

Wayne,

Thank you for your quick response. Your approach seems reasonable but I worry about the TableLock that has to be placed on the SEQUENCE table to do this. This application is medium to high transaction and they rely on getting values from this table for just about every transaction that takes place. I am not sure I can afford to lock the table for any prolong period of time. With the little simplified test, it is all sub-second but results could be different when working with the real data. I also do not know how the application processes would handle it if they could not get a sequence number when they need it.

I will have to think about this further.

The vendors code is very much row based. From what I have seen in running Profiler against it, they use Cursors for almost everything. Because of this, there approach would be to have us do the same so that we can get the sequence numbers one at a time just like they do. I am hoping for a solution that that would allow me to do set-based without affecting the current application processes. Any thoughts on how to do this without table locks?

LutzM SSC Guru Points: 107049 More actions · Answer 4

Understood. Just wanted to make sure.

Regarding your performance concern:

As long as you're not running this code with hundred thousand of rows from your source table during peak hours, you should be fine.

I'd use this code with batches of about 1000 to 5000 rows during non-peak hours (<75% max. peak). That should have minimal impact.

If the vendor software is c.u.r.s.o.r. *cough* based anyway, their software most probably won't even notice.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

WayneS SSC Guru Points: 95405 More actions · Answer 5

The reason for those table locks is to avoid assigning the same number to two different processes at once. Since what I wrote is set-based, it's working outside of their row-by-row process. To do anything in a set-based manner, I don't see a way to make it work without locking the table while numbers are being assigned for this set-based update.

As Lutz mentioned, when running this against a small set of data, you should have sub-second response anyway, and it's doubtful that their RBAR process would even notice. However, this is something that would need to be tested.

How many rows are you typically inserting?

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Chad Carter Ten Centuries Points: 1089 More actions · Answer 6

Chad Carter

Ten Centuries

Points: 1089

January 6, 2011 at 8:58 am

#1270154

The most I have seen at one time is a couple of hundred records so we should be OK. I will begin testing with it and getting approval from then vendor so that they are aware.

Thanks for all of your help with this.

Chad

LutzM SSC Guru Points: 107049 More actions · Answer 7

Chad Carter (1/6/2011)

The most I have seen at one time is a couple of hundred records so we should be OK. I will begin testing with it and getting approval from then vendor so that they are aware.

Thanks for all of your help with this.

Chad

Unless you're doing it with a Pentium I processor and a 200rpm drive you should be fine 😀

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

WayneS SSC Guru Points: 95405 More actions · Answer 8

Chad Carter (1/6/2011)

The most I have seen at one time is a couple of hundred records so we should be OK. I will begin testing with it and getting approval from then vendor so that they are aware.

Thanks for all of your help with this.

Chad

A couple of hundred records should only take a few milliseconds - well within the wait limits before things will time out and cause problems.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

tfifield SSCrazy Eights Points: 9655 More actions · Answer 9

Chad,

There's a way to do this without table locking and such if you can add a dummy column the the Sequence table. (Unless there is some way to use DEFAULT VALUES in a SELECT statement that I don't know about - please somebody correct me if there is a way to do it.)

If the 3rd party vendor will let you just change the Sequence table to look like this:



CREATE TABLE [dbo].[SEQUENCE](

        [retident] [int] IDENTITY(1,1) NOT NULL,

         DummyChar  CHAR(1),

 CONSTRAINT [PK_SEQUENCE] PRIMARY KEY CLUSTERED

(

        [retident] ASC

)

) ON [PRIMARY]

You want to create a snapshot of what you are about to insert. This could be the entire data or just key values that you can use to join back to the Source. The idea of the snapshot is so you never end up with someone concurrently inserting into the source after you have generated sequence numbers so that the number of rows inserted will always be correct.

Create a temp table of the insert:



SELECT

  RecID = IDENTITY (INT, 1, 1)

, Data1

INTO #S

FROM [SOURCE]

Create another temp table to capture the Identities inserted into

the main Sequence table. You can use a # temp table or table variable - I'm using a table variable here.



DECLARE @Tbl TABLE

( SeqValue  INT

, RecID     INT IDENTITY(1, 1) -- This lines up with the temp table

)

Generate the sequence numbers from the main Sequence table by inserting into the DummyChar column and capture the Identity.



INSERT INTO SEQUENCE

( DummyChar )

OUTPUT inserted.retident INTO @Tbl

SELECT 'X'

FROM #S

ORDER BY RecID

You now have #S with what you are going to insert and @Tbl with the generated Identities and they relate by RecID.



INSERT INTO DESTINATION

( Data1, seqvalue )

SELECT S.Data1, T.SeqValue

FROM #S S

INNER JOIN @Tbl T ON

   S.RecID = T.RecID

This should avoid any table locking and concurrency issues.

Todd Fifield

Chad Carter Ten Centuries Points: 1089 More actions · Answer 10

Chad Carter

Ten Centuries

Points: 1089

January 10, 2011 at 8:37 am

#1271445

That is good to know as well. That was a big factor limiting me on what I could do trying to work with a table with nothing other than the identity column. I don't see me being able to get a dummy column added to the table but I can always ask.

Thanks!

Chad

tfifield SSCrazy Eights Points: 9655 More actions · Answer 11

Chad,

I would try to politely embarrass the 3rd party into allowing a dummy column in the sequence table. After all, a table with a single Identity column is pretty dumb (I did say politely).

About the only reason I could think of to not allow another column would be equally dumb. That would be something like:



CREATE VIEW DumbView

AS

SELECT SEQUENCE.*, DATA.*

FROM SEQUENCE

INNER JOIN DATA ON

   SEQUENCE.SEQUENCE_NO = DATA.SEQUENCE_NO

Todd Fifield