Use of Identity Property to Resolve Concurrency Issues

  • Solomon Rutzky (2/26/2011)


    Ajit, this is an interesting approach to the problem, but most likely it is just adding complication via over-engineering when a much simpler solution appears to be available.

    1) The most important and easiest thing to do is, as "dvdwouwe 72544" mentioned, create a PK on the "key" column. There is really no reason to have this table without a PK defined. This will also ensure that a duplicate "key" doesn't accidentally get added to the table.

    2) You don't need to worry about the SELECT of the new value getting something updated from another session if you use the OUTPUT clause in the UPDATE as follows:

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value
WHERE column_key = @key;

This avoids the need for the additional locking that "tech.dbmeyer" had suggested (a sketch of capturing the value into a variable follows the list below).

    3) You can always tell the system to only lock a single row (although given the singular nature of the UPDATE statement I am not sure why it would ever do anything else, but still) by using the WITH (ROWLOCK) hint in the UPDATE as "SanDroid" mentioned.
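Regarding point 2: if you need the new value in a T-SQL variable rather than as a result set sent to the client, capture it with OUTPUT ... INTO a table variable first. A minimal sketch (the variable names are illustrative, and I'm assuming column_value is a BIGINT):

DECLARE @CapturedValues TABLE (new_value BIGINT NOT NULL);
DECLARE @NewValue BIGINT;

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value INTO @CapturedValues (new_value)
WHERE column_key = @key;

SELECT @NewValue = new_value FROM @CapturedValues;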

    Minor note: I find it interesting / amusing that whoever created this table (a developer I think you said) prefixed the column names with "column_". 😉

Thanks for your feedback, Solomon.

Even with the primary key in place, if two user sessions attempt to increment the same key as part of their own transactions, one session will end up blocking the other.

Ideally, incrementing the KVP value should be done outside transactions, but application developers seldom adhered to this principle. As I've said in previous posts, the main goals for me were to eliminate this blocking and also to make the incrementing process independent of transactions.

By turning the single horizontal record for a given key into vertical rows (one per increment), transactions can increment the KVP and obtain the appropriate values without blocking each other.
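In code terms, the idea looks roughly like this (a sketch with illustrative names - the article has the actual implementation):

-- One hidden table per key; the IDENTITY property does the incrementing
CREATE TABLE dbo.kvp_hidden_mykey
(
    dummy_col BIT NULL,
    key_value BIGINT IDENTITY(1, 1) NOT NULL
);

-- Each session inserts its own row, so concurrent sessions
-- never compete for the same row:
DECLARE @allocated TABLE (key_value BIGINT NOT NULL);

INSERT dbo.kvp_hidden_mykey (dummy_col)
OUTPUT inserted.key_value INTO @allocated (key_value)
VALUES (NULL);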

    Ajit Ananthram
    Blog - http://ajitananthram.wordpress.com

  • Solomon Rutzky (2/26/2011)


    ...You don't need to worry about the SELECT of the new value getting something updated from another session if you use the OUTPUT clause in the UPDATE as follows:

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value
WHERE column_key = @key;

    This avoids the need for additional locking as "tech.dbmeyer" had suggested.

    Yes, but the problem is that the key-allocation procedure might be called within a transaction. The exclusive row lock needed by the statement above will be held to the end of that transaction. An application that starts a transaction, allocates a key from this procedure, and then sits around waiting for something, will block anyone else needing the next key from the same sequence.

    I wrote about this issue at length (including robust ways to pre-allocate a range of keys without blocking) in:

    http://sqlblog.com/blogs/paul_white/archive/2010/10/19/sequence-tables.aspx
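To see the problem, try something like this from two sessions (a sketch - it assumes the tbl_kvp table discussed above; the key value and column types are illustrative):

-- Session 1: allocate a value inside a transaction, then sit around
DECLARE @key VARCHAR(50) = 'mykey';   -- illustrative key and type

BEGIN TRANSACTION;

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value
WHERE column_key = @key;

WAITFOR DELAY '00:01:00';   -- simulate a caller that waits

-- Session 2: the same UPDATE for the same key now blocks here
-- until session 1 commits or rolls back.

ROLLBACK TRANSACTION;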

It is possible to avoid the dummy column, the ever-growing number of rows in the hidden tables, and the locking issues by using an approach based on this example code (the example uses a single range, but it is easy to extend the idea):

USE tempdb;
GO

-- Drop the table if it exists from a previous run
IF OBJECT_ID(N'dbo.SneakyIdentity', N'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.SneakyIdentity;
END;
GO

-- Create the hidden table used to allocate IDs
CREATE TABLE dbo.SneakyIdentity (row_id BIGINT IDENTITY NOT NULL);
GO

-- The allocation routine
CREATE PROCEDURE dbo.Allocate
(
    @ID BIGINT OUTPUT
)
AS
BEGIN
    -- No DONE_IN_PROC messages
    SET NOCOUNT ON;

    -- Almost all errors will abort the batch
    SET XACT_ABORT ON;

    -- Table variable used with the OUTPUT clause
    -- to safely read the identity value assigned.
    -- Issues (bugs) have arisen in the past with
    -- @@IDENTITY, SCOPE_IDENTITY, and IDENT_CURRENT.
    DECLARE @Output TABLE (row_id BIGINT NOT NULL);

    -- Start error handler
    BEGIN TRY
        -- Might not be the top-level transaction!
        BEGIN TRANSACTION;

        -- Save point
        SAVE TRANSACTION AllocateID;

        -- Allocate an ID using the 'hidden' table
        INSERT dbo.SneakyIdentity
        OUTPUT inserted.row_id INTO @Output (row_id)
        DEFAULT VALUES;

        -- Roll back to the save point
        ROLLBACK TRANSACTION AllocateID;

        -- Correct the transaction count
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- Remember the error message
        DECLARE @ErrMsg NVARCHAR(2048);
        SET @ErrMsg = ERROR_MESSAGE();

        -- Uncommittable transaction => unconditional rollback
        -- (this is the only option)
        IF XACT_STATE() = -1 ROLLBACK TRANSACTION;

        -- If our transaction is still alive, roll back to
        -- the save point, and adjust @@TRANCOUNT
        IF XACT_STATE() = 1
        BEGIN
            -- Roll back to the save point
            ROLLBACK TRANSACTION AllocateID;

            -- Correct the transaction count
            COMMIT TRANSACTION;
        END

        -- Custom error handling goes here
        RAISERROR('Error in allocation: %s', 16, 1, @ErrMsg);

        -- Error
        RETURN 999;
    END CATCH;

    -- Capture the assigned value
    -- (table variables are not affected by transaction rollbacks)
    SET @ID = (SELECT TOP (1) row_id FROM @Output);

    -- Success
    RETURN 0;
END;
GO

-- === TEST ===

-- Variables
DECLARE @ID BIGINT,
        @rc INTEGER;

-- Simulate an enclosing user transaction
BEGIN TRANSACTION;

-- Allocate the next id
EXECUTE @rc = dbo.Allocate @ID OUTPUT;

-- Show locks held by this session
SELECT *
FROM sys.dm_tran_locks AS TL
WHERE TL.request_session_id = @@SPID;

-- Show the procedure return code and allocated ID
SELECT return_code = @rc,
       allocated_id = @ID;

-- Now roll back or commit
ROLLBACK;

-- Uncomment these when finished
-- DROP TABLE dbo.SneakyIdentity;
-- DROP PROCEDURE dbo.Allocate;

  • That's a fairly nice solution to the problem, however...(There's always a but...)

    The biggest problem I see with your approach is that when someone wants to add a new key-value pair they also need to add a new table. I dislike needing to alter the design of the database just because someone wants to add a data value.

Therefore: Wouldn't it be easier to use a single nextnum table instead of one for each key? Your get-sequence-number code would work the same, and you wouldn't need to use dynamic SQL.

    --

    JimFive

  • James Goodwin (2/28/2011)


    That's a fairly nice solution to the problem, however...(There's always a but...)

    The biggest problem I see with your approach is that when someone wants to add a new key-value pair they also need to add a new table. I dislike needing to alter the design of the database just because someone wants to add a data value.

Therefore: Wouldn't it be easier to use a single nextnum table instead of one for each key? Your get-sequence-number code would work the same, and you wouldn't need to use dynamic SQL.

That is what they started with, but they had locking and blocking issues while updating that one nextnum table to obtain the next key value.

    I suspect that they would only add a new key when they add a new table, so instead of just adding one table you add two. Is the solution perfect? No, but it fixed the problem they were encountering without having to do a major redesign of the database or the application.

That is what they started with, but they had locking and blocking issues while updating that one nextnum table to obtain the next key value.

    I think I was unclear.

I am suggesting that instead of one dummy table for each key-value, you create a single dummy table and grab the next identity value from it just as in your solution. The only difference is that the dummy table is shared amongst all of the key-value pairs. This may, of course, leave gaps in the sequence for each key-value, but you have stated that that doesn't matter in this case.
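In other words, something like this (a sketch only - the table name is illustrative):

-- One shared dummy table for all key-value pairs:
CREATE TABLE dbo.NextNum (row_id BIGINT IDENTITY NOT NULL);

-- Any key's 'next value' request draws from the same IDENTITY.
-- Per-key sequences will have gaps, but each value is unique:
DECLARE @out TABLE (row_id BIGINT NOT NULL);

INSERT dbo.NextNum
OUTPUT inserted.row_id INTO @out (row_id)
DEFAULT VALUES;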

    --

    JimFive

  • Grigore Dolghin (2/25/2011)


I am afraid this approach is even worse. What if for each of those keys the developer has to insert some records in a related table, then use those keys to insert related records in a third table? And what if those keys are actually autoincrement values and can't be obtained beforehand? And what if the values have to be generated taking into account the other users working at the same time (such as sequential invoice numbers)?

    Locking must be implemented on database side, period.

Please do not label solutions as worse if you do not understand them. What I suggested is that you still keep the locking on the database side, but do not request one key at a time. Any application can request 100 keys at a time and cache them on the application side. I do not see the connection between what I suggested and using the keys in multiple tables. If you must have all the entries in sequential order and cannot tolerate missing keys, then what I suggest is not usable. By the way, in Oracle, sequences can have an increment different from 1, just as identity columns can in SQL Server.
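For example, a single statement can reserve a block of 100 keys (a sketch using the tbl_kvp schema from the article; the key value and its type are illustrative):

DECLARE @key VARCHAR(50) = 'mykey';   -- illustrative
DECLARE @rangeEnd TABLE (last_value BIGINT NOT NULL);

-- Reserve 100 values atomically; the row lock is held only
-- for the duration of this single statement:
UPDATE tbl_kvp
SET column_value += 100
OUTPUT inserted.column_value INTO @rangeEnd (last_value)
WHERE column_key = @key;

-- The application now owns (last_value - 99) through last_value
-- and can hand them out from its own cache without touching the table.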

You are correct, and I agree. Identity columns should not be used for business data in that way. My point was that the transaction responsible for assigning the "gapless" sequence in the original example SP would achieve this, but the function as re-written would not. I would normally expect to assign such numbers in a batch process and allocate them all, or none, at the end of the day. In some cases, we see internet businesses where (for example) the invoice number has to be applied immediately to allow a web-based order processing system, coupled with the business requirement for no gaps in the sequence of tax invoice numbers. In this specific case, the original stored procedure works, but the re-write may not.

  • peter.row (2/24/2011)


    Why use the output syntax, table variable and dynamic sql?

Wouldn't the following be simpler (and there's no permissions issue on the table if the user running it only has access to the SP):

    insert into ...

    set @outputId = scope_identity()

    This was my original thought as well, maybe I am missing something.

  • nycgags (3/22/2013)


    peter.row (2/24/2011)


    Why use the output syntax, table variable and dynamic sql?

Wouldn't the following be simpler (and there's no permissions issue on the table if the user running it only has access to the SP):

    insert into ...

    set @outputId = scope_identity()

    This was my original thought as well, maybe I am missing something.

    The scope_identity function has been known to generate incorrect values at times (when queries make use of parallel execution plans). Refer to this link for more information -->
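The OUTPUT clause does not suffer from this, because it reads the value from the row actually inserted. A minimal sketch (the target table and column names here are hypothetical):

-- Hypothetical target table, for illustration only
CREATE TABLE dbo.SomeTable
(
    id BIGINT IDENTITY NOT NULL,
    payload INT NOT NULL
);

DECLARE @out TABLE (id BIGINT NOT NULL);
DECLARE @outputId BIGINT;

INSERT dbo.SomeTable (payload)
OUTPUT inserted.id INTO @out (id)
VALUES (42);

SELECT @outputId = id FROM @out;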

    Ajit Ananthram
    Blog - http://ajitananthram.wordpress.com

  • Paul White (2/26/2011)


    It is possible to avoid the dummy column, ever-growing number of rows in the hidden tables, and avoid locking issues by using an approach based on this example code: (the example just uses a single range, but it is easy to extend the idea)

    --really cool code from page 5 snipped

    Wow, I stared at this code for a good 30 minutes, just admiring the beauty and cleverness and (unsuccessfully) trying to find a flaw with it. I'm sad that I'm discovering this 2 years late, but this is a really cool (and yes, sneaky) trick. Paul, have you ever made a separate blog post detailing this technique? (Or know of a similar one by someone else?)

I don't have anything new to add, but I want to rehash some of the things going on, just to clarify for anybody who's confused or missing the point.

The OP's original problem was that he's using a stored procedure to get a key value from another table and seeing blocking, because concurrent calls to that stored procedure are trying to access a value from the same row. Note that even if the stored procedure itself doesn't leave an open transaction locking the rows, it's possible that the calling routine, which the DBA may or may not have control over, is calling the key procedure inside a long-running transaction, leaving long-standing locks in place despite the speed or intentions of the actual sproc. The OP's solution is to ensure that, rather than having parallel processes calling the sproc access the same row (which leads to the potential blocking), each call creates a new row in the table, so that as long as only row locks are used, multiple calls will never block one another. The primary drawbacks of this are the potential need for many tables and the creation of many rows of data which are never used after their initial creation.

Paul's code doesn't directly address the problem of multiple different applications, although one could easily do the same as the OP and create a new table for each application and still use Paul's trick. One of the cool things about Paul's technique, though, is that it doesn't result in many rows... in fact, it doesn't result in ANY rows! He's essentially using the table's metadata to track the keys rather than using the table data itself. And he's doing it without ANY locking issues at all. (Yes, there's probably an instantaneous lock, but it won't be held even in a long-running transaction.)

So what's going on in Paul's code that makes it so sneaky? There are a few things, so I'll start from the middle and work out. The heart of it is the INSERT, which uses an OUTPUT clause to save the new IDENTITY value automatically generated by the table (note that, as others have posted, this OUTPUT technique is safer than using SCOPE_IDENTITY() or related functions). The OUTPUT saves the value in a table variable, which is critical because table variables do not participate in transactions.

This becomes important because the INSERT is nested inside a SAVE/ROLLBACK transaction. I'm guessing everybody who reads this is fully aware of BEGIN/COMMIT/ROLLBACK transactions, although I do find that many people don't realize how nested transactions work. Transactions can be named to help clarify nesting levels; however, these names have NO effect on COMMIT/ROLLBACK calls. COMMIT will always commit the innermost transaction, while ROLLBACK will always roll back the outermost transaction, including all nested transactions. The exception to this is when creating a SAVE point, as Paul does. A ROLLBACK cannot be limited to a nested BEGIN TRANSACTION, but it can be limited to a SAVE TRANSACTION point.

So what's happening here is Paul creates a SAVE point right before the insert, then inserts the data, and then immediately rolls back to the save point. The ROLLBACK undoes the INSERT, which includes releasing any locks that were acquired by the INSERT. However, because table variables are not included in transactions, the IDENTITY value which was OUTPUT into the table variable is still present and available for use. Also, the IDENTITY seed value in the table's metadata isn't included in the transaction and so stays incremented even through the rollback, so that future calls will continue incrementing it rather than repeating values.

    So the net result here is that, because of the save point and rollback, no rows are written to any table and no locks are being allocated. However, we still get back the new identity value, and the table meta data still has a record of what that newest identity value was so that it doesn't repeat. Sneaky indeed.
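Both behaviors are easy to verify (this assumes Paul's dbo.SneakyIdentity table from page 5 exists):

DECLARE @t TABLE (row_id BIGINT NOT NULL);

BEGIN TRANSACTION;

INSERT dbo.SneakyIdentity
OUTPUT inserted.row_id INTO @t (row_id)
DEFAULT VALUES;

ROLLBACK TRANSACTION;

-- The inserted row is gone from the base table,
-- but the table variable kept its copy:
SELECT row_id FROM @t;

-- And the identity seed stayed incremented through the rollback:
SELECT last_seed = IDENT_CURRENT(N'dbo.SneakyIdentity');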

    It's also worth paying attention to how Paul handles the TRY...CATCH block and his COMMIT transactions. Though not directly related to the intent of the post, they're often misunderstood and this is a great example of how to use them properly.

    Sorry for the wordiness, but hope this helps clarify things.

Wow! You folks have been busy! I'm just reading this at 2 pm EDT, and by the time I got to the end, I forgot what the original purpose was. I had to go back and re-read some sections to catch up.

We have a mail-order business which also employs a call center and Internet sales. We have from 10 to 25 people (depending on season) taking orders by phone or from mail, and the web runs 24-7. Each entry person uses the same custom OrderEntry program (VB6) as the front end; SQL '08 is the DB. Each order gets a unique order number. As you can see from the sample of the OrderNumber table below, the TYPE of order can determine the order number. Still, if there were only one type, it wouldn't matter. Simply put, the calling procedure adds 1 to the current order number. The next person gets the current order number and adds 1, etc. The web system pulls from the same table. Note the column "NextAvailableOrderNumber". In my 2.5 years here, we've never had a collision or duplication, and this system has been running since the late 90's. The calling program identifies the type based on location and feeds that as a parameter to the calling sproc. It's just a glorified counter (a guess at the mechanics is sketched after the table). Sorry if I'm missing the point.

CatalogYear  OrderType  Range  Description                      NextAvailableOrderNumber  OrderNumberRangeLow  OrderNumberRangeHigh
2012         C          1      Credit Card Orders               0                         400000               499999
2012         C          2      Credit Card Orders w/Cards       0                         570000               579999
2012         C          3      Credit Card Orders (FAX)         0                         890000               895999
2012         C          4      Credit Card Orders (EMail)       0                         580000               589999
2012         C          5      Credit Card Orders (999999)      0                         870000               873999
2012         C          6      Credit Card Orders (WEB Site)    0                         580000               599999
2012         E          1      Employee Charge                  0                         899000               899999
2012         R          1      Regular (Check) Orders           0                         600000               799999
2012         R          2      Regular (Check) Orders           0                         855000               857999
2012         R          3      Regular (Check) Orders (999999)  0                         855000               857999
2012         T          1      Telephone Orders                 100219                    100000               129999
2012         W          1      WalkIn Charge                    0                         897500               897999
2012         X          1      WalkIn Check                     0                         897000               897499
2012         Z          1      Employee Check                   0                         898000               898999
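In T-SQL terms, I assume the counter bump looks something like this (just a sketch - the table name and the exact predicate are guesses based on the columns above):

DECLARE @next TABLE (order_number INT NOT NULL);

-- Hypothetical: take the next number for 2012 telephone orders,
-- range 1, staying inside the allowed range for that order type.
UPDATE dbo.OrderNumber
SET NextAvailableOrderNumber += 1
OUTPUT inserted.NextAvailableOrderNumber INTO @next (order_number)
WHERE CatalogYear = 2012
  AND OrderType = 'T'
  AND [Range] = 1
  AND NextAvailableOrderNumber < OrderNumberRangeHigh;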

    Ajit, I'm more curious about how you scheduled your diagnostic procedure to run and find the blocks. I've copied and run it, but it found nothing, because I don't think anything was actually going on at the time. Did you just run it once, or did you have it repeat periodically, knowing the approximate time the offending procedure would run?

    Wallace Houston
    Sunnyland Farms, Inc.

    "We must endeavor to persevere."

  • This solution would appear to be a sound approach to resolve the locking issue. However, it seems that perhaps a simpler approach would be to move the call to the function to get the key out of the transaction. Call the function to get the key; if it is null then error out; if it is non-null proceed with transaction processing.
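For instance (a sketch only - the procedure name follows Paul's example above):

DECLARE @ID BIGINT, @rc INT;

-- Allocate the key before any transaction starts, so the
-- allocation's locks are never tied to the user transaction:
EXECUTE @rc = dbo.Allocate @ID OUTPUT;

IF @ID IS NULL
BEGIN
    RAISERROR('Key allocation failed', 16, 1);
    RETURN;
END;

BEGIN TRANSACTION;
-- ... transactional work using @ID goes here ...
COMMIT TRANSACTION;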

  • NBSteve (3/22/2013)


    Paul White (2/26/2011)


    It is possible to avoid the dummy column, ever-growing number of rows in the hidden tables, and avoid locking issues by using an approach based on this example code: (the example just uses a single range, but it is easy to extend the idea)

    --really cool code from page 5 snipped

    Wow, I stared at this code for a good 30 minutes, just admiring the beauty and cleverness and (unsuccessfully) trying to find a flaw with it. I'm sad that I'm discovering this 2 years late, but this is a really cool (and yes, sneaky) trick. Paul, have you ever made a separate blog post detailing this technique? (Or know of a similar one by someone else?)

Thanks, and no, I have never blogged about it - it seemed a bit of a narrow use case, to be honest. Your explanation is spot on, by the way.

  • Ajit, I'm more curious about how you scheduled your diagnostic procedure to run and find the blocks. I've copied and run it, but it found nothing, because I don't think anything was actually going on at the time. Did you just run it once, or did you have it repeat periodically, knowing the approximate time the offending procedure would run?

Thanks for your question. I wrote this article 2 years ago (it was republished yesterday), and back then I used the diagnostic script in a job which executed it in a loop with a pause of a few seconds (i.e. with a WAITFOR DELAY statement) and had some logic to log to a table. The script only captures details of locks that exist when it executes (i.e. at that instant).
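The rough shape of the job step was something like this (a reconstruction - the logging table, its columns, and the delay are illustrative, not the original script):

IF OBJECT_ID(N'dbo.LockCaptureLog', N'U') IS NULL
    CREATE TABLE dbo.LockCaptureLog
    (
        capture_time DATETIME2 NOT NULL,
        request_session_id INT NOT NULL,
        resource_type NVARCHAR(60) NULL,
        request_mode NVARCHAR(60) NULL
    );

WHILE 1 = 1
BEGIN
    -- Log only lock requests that are currently waiting (blocked):
    INSERT dbo.LockCaptureLog
        (capture_time, request_session_id, resource_type, request_mode)
    SELECT SYSDATETIME(), TL.request_session_id,
           TL.resource_type, TL.request_mode
    FROM sys.dm_tran_locks AS TL
    WHERE TL.request_status = N'WAIT';

    WAITFOR DELAY '00:00:05';   -- pause a few seconds between captures
END;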

    If you are looking for a more customisable and optimised version of such a diagnostic script, I'd highly recommend MVP Adam Machanic's sp_whoisactive stored procedure. The latest version of this procedure is v11.11 at the time of writing this comment, and this can be obtained from the following location --> (http://sqlblog.com/blogs/adam_machanic/archive/2012/03/22/released-who-is-active-v11-11.aspx)

    Ajit Ananthram
    Blog - http://ajitananthram.wordpress.com
