Use of Identity Property to Resolve Concurrency Issues

  • Solomon Rutzky (2/26/2011)


    Ajit, this is an interesting approach to the problem, but most likely it is just adding complication via over-engineering when a much simpler solution appears to be available.

    1) The most important and easiest thing to do is, as "dvdwouwe 72544" mentioned, create a PK on the "key" column. There is really no reason to have this table without a PK defined. This will also ensure that a duplicate "key" doesn't accidentally get added to the table.

    2) You don't need to worry about the SELECT of the new value getting something updated from another session if you use the OUTPUT clause in the UPDATE as follows:

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value
WHERE column_key = @key;

This avoids the need for the additional locking that "tech.dbmeyer" had suggested (a sketch of capturing the value into a variable follows the list below).

    3) You can always tell the system to only lock a single row (although given the singular nature of the UPDATE statement I am not sure why it would ever do anything else, but still) by using the WITH (ROWLOCK) hint in the UPDATE as "SanDroid" mentioned.
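Regarding point 2: if you need the new value in a T-SQL variable rather than as a result set sent to the client, capture it with OUTPUT ... INTO a table variable first. A minimal sketch (the variable names are illustrative, and I'm assuming column_value is a BIGINT):

DECLARE @CapturedValues TABLE (new_value BIGINT NOT NULL);
DECLARE @NewValue BIGINT;

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value INTO @CapturedValues (new_value)
WHERE column_key = @key;

SELECT @NewValue = new_value FROM @CapturedValues;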

    Minor note: I find it interesting / amusing that whoever created this table (a developer I think you said) prefixed the column names with "column_". 😉

Thanks for your feedback, Solomon.

Even with the primary key in place, if two user sessions attempt to increment the same key as part of their own transactions, one session will end up blocking the other.

Ideally, incrementing the KVP value should be done outside transactions, but application developers seldom adhered to this principle. As I've said in previous posts, the main goals for me were to eliminate this blocking and also to make the incrementing process independent of transactions.

By turning the single horizontal record for a given key into vertical rows (one per increment), transactions can increment the KVP and obtain the appropriate values without blocking each other.
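In code terms, the idea looks roughly like this (a sketch with illustrative names - the article has the actual implementation):

-- One hidden table per key; the IDENTITY property does the incrementing
CREATE TABLE dbo.kvp_hidden_mykey
(
    dummy_col BIT NULL,
    key_value BIGINT IDENTITY(1, 1) NOT NULL
);

-- Each session inserts its own row, so concurrent sessions
-- never compete for the same row:
DECLARE @allocated TABLE (key_value BIGINT NOT NULL);

INSERT dbo.kvp_hidden_mykey (dummy_col)
OUTPUT inserted.key_value INTO @allocated (key_value)
VALUES (NULL);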

    Ajit Ananthram
    Blog - http://ajitananthram.wordpress.com

  • Solomon Rutzky (2/26/2011)


    ...You don't need to worry about the SELECT of the new value getting something updated from another session if you use the OUTPUT clause in the UPDATE as follows:

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value
WHERE column_key = @key;

    This avoids the need for additional locking as "tech.dbmeyer" had suggested.

    Yes, but the problem is that the key-allocation procedure might be called within a transaction. The exclusive row lock needed by the statement above will be held to the end of that transaction. An application that starts a transaction, allocates a key from this procedure, and then sits around waiting for something, will block anyone else needing the next key from the same sequence.

    I wrote about this issue at length (including robust ways to pre-allocate a range of keys without blocking) in:

    http://sqlblog.com/blogs/paul_white/archive/2010/10/19/sequence-tables.aspx
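To see the problem, try something like this from two sessions (a sketch - it assumes the tbl_kvp table discussed above; the key value and column types are illustrative):

-- Session 1: allocate a value inside a transaction, then sit around
DECLARE @key VARCHAR(50) = 'mykey';   -- illustrative key and type

BEGIN TRANSACTION;

UPDATE tbl_kvp
SET column_value += 1
OUTPUT inserted.column_value
WHERE column_key = @key;

WAITFOR DELAY '00:01:00';   -- simulate a caller that waits

-- Session 2: the same UPDATE for the same key now blocks here
-- until session 1 commits or rolls back.

ROLLBACK TRANSACTION;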

It is possible to avoid the dummy column, the ever-growing number of rows in the hidden tables, and the locking issues by using an approach based on this example code (the example uses a single range, but it is easy to extend the idea):

USE tempdb;
GO

-- Drop the table if it exists from a previous run
IF OBJECT_ID(N'dbo.SneakyIdentity', N'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.SneakyIdentity;
END;
GO

-- Create the hidden table used to allocate IDs
CREATE TABLE dbo.SneakyIdentity (row_id BIGINT IDENTITY NOT NULL);
GO

-- The allocation routine
CREATE PROCEDURE dbo.Allocate
(
    @ID BIGINT OUTPUT
)
AS
BEGIN
    -- No DONE_IN_PROC messages
    SET NOCOUNT ON;

    -- Almost all errors will abort the batch
    SET XACT_ABORT ON;

    -- Table variable used with the OUTPUT clause
    -- to safely read the identity value assigned.
    -- Issues (bugs) have arisen in the past with
    -- @@IDENTITY, SCOPE_IDENTITY, and IDENT_CURRENT.
    DECLARE @Output TABLE (row_id BIGINT NOT NULL);

    -- Start error handler
    BEGIN TRY
        -- Might not be the top-level transaction!
        BEGIN TRANSACTION;

        -- Save point
        SAVE TRANSACTION AllocateID;

        -- Allocate an ID using the 'hidden' table
        INSERT dbo.SneakyIdentity
        OUTPUT inserted.row_id INTO @Output (row_id)
        DEFAULT VALUES;

        -- Roll back to the save point
        ROLLBACK TRANSACTION AllocateID;

        -- Correct the transaction count
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- Remember the error message
        DECLARE @ErrMsg NVARCHAR(2048);
        SET @ErrMsg = ERROR_MESSAGE();

        -- Uncommittable transaction => unconditional rollback
        -- (this is the only option)
        IF XACT_STATE() = -1 ROLLBACK TRANSACTION;

        -- If our transaction is still alive, roll back to
        -- the save point, and adjust @@TRANCOUNT
        IF XACT_STATE() = 1
        BEGIN
            -- Roll back to the save point
            ROLLBACK TRANSACTION AllocateID;

            -- Correct the transaction count
            COMMIT TRANSACTION;
        END

        -- Custom error handling goes here
        RAISERROR('Error in allocation: %s', 16, 1, @ErrMsg);

        -- Error
        RETURN 999;
    END CATCH;

    -- Capture the assigned value
    -- (table variables are not affected by transaction rollbacks)
    SET @ID = (SELECT TOP (1) row_id FROM @Output);

    -- Success
    RETURN 0;
END;
GO

-- === TEST ===

-- Variables
DECLARE @ID BIGINT,
        @rc INTEGER;

-- Simulate an enclosing user transaction
BEGIN TRANSACTION;

-- Allocate the next id
EXECUTE @rc = dbo.Allocate @ID OUTPUT;

-- Show locks held by this session
SELECT *
FROM sys.dm_tran_locks AS TL
WHERE TL.request_session_id = @@SPID;

-- Show the procedure return code and allocated ID
SELECT return_code = @rc,
       allocated_id = @ID;

-- Now roll back or commit
ROLLBACK;

-- Uncomment these when finished
-- DROP TABLE dbo.SneakyIdentity;
-- DROP PROCEDURE dbo.Allocate;

  • That's a fairly nice solution to the problem, however...(There's always a but...)

    The biggest problem I see with your approach is that when someone wants to add a new key-value pair they also need to add a new table. I dislike needing to alter the design of the database just because someone wants to add a data value.

Therefore: Wouldn't it be easier to use a single nextnum table instead of one for each key? Your get-sequence-number code would work the same, and you wouldn't need to use dynamic SQL.

    --

    JimFive

  • James Goodwin (2/28/2011)


    That's a fairly nice solution to the problem, however...(There's always a but...)

    The biggest problem I see with your approach is that when someone wants to add a new key-value pair they also need to add a new table. I dislike needing to alter the design of the database just because someone wants to add a data value.

Therefore: Wouldn't it be easier to use a single nextnum table instead of one for each key? Your get-sequence-number code would work the same, and you wouldn't need to use dynamic SQL.

That is what they started with, but they had locking and blocking issues while updating that one nextnum table to obtain the next key value.

    I suspect that they would only add a new key when they add a new table, so instead of just adding one table you add two. Is the solution perfect? No, but it fixed the problem they were encountering without having to do a major redesign of the database or the application.

That is what they started with, but they had locking and blocking issues while updating that one nextnum table to obtain the next key value.

    I think I was unclear.

I am suggesting that instead of one dummy table for each key-value, you create a single dummy table and grab the next identity value from it just as in your solution. The only difference is that the dummy table is shared amongst all of the key-value pairs. This may, of course, leave gaps in the sequence for each key-value, but you have stated that that doesn't matter in this case.
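In other words, something like this (a sketch only - the table name is illustrative):

-- One shared dummy table for all key-value pairs:
CREATE TABLE dbo.NextNum (row_id BIGINT IDENTITY NOT NULL);

-- Any key's 'next value' request draws from the same IDENTITY.
-- Per-key sequences will have gaps, but each value is unique:
DECLARE @out TABLE (row_id BIGINT NOT NULL);

INSERT dbo.NextNum
OUTPUT inserted.row_id INTO @out (row_id)
DEFAULT VALUES;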

    --

    JimFive

  • Grigore Dolghin (2/25/2011)


I am afraid this approach is even worse. What if for each of those keys the developer has to insert some records in a related table, then use those keys to insert related records in a third table? And what if those keys are actually autoincrement values and can't be obtained beforehand? And what if the values have to be generated taking into account the other users working at the same time (such as sequential invoice numbers)?

    Locking must be implemented on database side, period.

Please do not label solutions as worse if you do not understand them. What I suggested is that you still keep the locking on the database side, but do not request one key at a time. Any application can request 100 keys at a time and cache them on the application side. I do not see the connection between what I suggested and using the keys in multiple tables. If you must have all the entries in sequential order and cannot tolerate missing keys, then what I suggest is not usable. By the way, in Oracle, sequences can have an increment different from 1, just as identity columns can in SQL Server.
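For example, a single statement can reserve a block of 100 keys (a sketch using the tbl_kvp schema from the article; the key value and its type are illustrative):

DECLARE @key VARCHAR(50) = 'mykey';   -- illustrative
DECLARE @rangeEnd TABLE (last_value BIGINT NOT NULL);

-- Reserve 100 values atomically; the row lock is held only
-- for the duration of this single statement:
UPDATE tbl_kvp
SET column_value += 100
OUTPUT inserted.column_value INTO @rangeEnd (last_value)
WHERE column_key = @key;

-- The application now owns (last_value - 99) through last_value
-- and can hand them out from its own cache without touching the table.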

You are correct, and I agree. Identity columns should not be used for business data in that way. My point was that the transaction responsible for assigning the "gapless" sequence in the original example SP would achieve this, but the function as re-written would not. I would normally expect to assign such numbers in a batch process and allocate them all, or none, at the end of the day. In some cases, we see internet businesses where (for example) the invoice number has to be applied immediately to allow a web-based order processing system, coupled with the business requirement for no gaps in the sequence of tax invoice numbers. In this specific case, the original stored procedure works, but the re-write may not.

  • peter.row (2/24/2011)


    Why use the output syntax, table variable and dynamic sql?

Wouldn't the following be simpler (and there's no permissions issue on the table if the user running it only has access to the SP):

    insert into ...

    set @outputId = scope_identity()

    This was my original thought as well, maybe I am missing something.

  • nycgags (3/22/2013)


    peter.row (2/24/2011)


    Why use the output syntax, table variable and dynamic sql?

Wouldn't the following be simpler (and there's no permissions issue on the table if the user running it only has access to the SP):

    insert into ...

    set @outputId = scope_identity()

    This was my original thought as well, maybe I am missing something.

    The scope_identity function has been known to generate incorrect values at times (when queries make use of parallel execution plans). Refer to this link for more information -->
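The OUTPUT clause does not suffer from this, because it reads the value from the row actually inserted. A minimal sketch (the target table and column names here are hypothetical):

-- Hypothetical target table, for illustration only
CREATE TABLE dbo.SomeTable
(
    id BIGINT IDENTITY NOT NULL,
    payload INT NOT NULL
);

DECLARE @out TABLE (id BIGINT NOT NULL);
DECLARE @outputId BIGINT;

INSERT dbo.SomeTable (payload)
OUTPUT inserted.id INTO @out (id)
VALUES (42);

SELECT @outputId = id FROM @out;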

    Ajit Ananthram
    Blog - http://ajitananthram.wordpress.com

  • Paul White (2/26/2011)


    It is possible to avoid the dummy column, ever-growing number of rows in the hidden tables, and avoid locking issues by using an approach based on this example code: (the example just uses a single range, but it is easy to extend the idea)

    --really cool code from page 5 snipped

    Wow, I stared at this code for a good 30 minutes, just admiring the beauty and cleverness and (unsuccessfully) trying to find a flaw with it. I'm sad that I'm discovering this 2 years late, but this is a really cool (and yes, sneaky) trick. Paul, have you ever made a separate blog post detailing this technique? (Or know of a similar one by someone else?)

I don't have anything new to add, but I want to rehash some of the things going on, just to clarify for anybody who's confused or missing the point.

The OP's original problem was that he's using a stored procedure to get a key value from another table and seeing blocking, because concurrent calls to that stored procedure are trying to access a value from the same row. Note that even if the stored procedure itself doesn't leave an open transaction locking the rows, it's possible that the calling routine, which the DBA may or may not have control over, is calling the key procedure inside a long-running transaction, leaving long-standing locks in place despite the speed or intentions of the actual sproc. The OP's solution is to ensure that, rather than having parallel processes calling the sproc access the same row (which leads to the potential blocking), each call creates a new row in the table, so that as long as only row locks are used, multiple calls will never block one another. The primary drawbacks of this are the potential need for many tables and the creation of many rows of data which are never used after their initial creation.

Paul's code doesn't directly address the problem of multiple different applications, although one could easily do the same as the OP and create a new table for each application and still use Paul's trick. One of the cool things about Paul's technique, though, is that it doesn't result in many rows... in fact, it doesn't result in ANY rows! He's essentially using the table's metadata to track the keys rather than using the table data itself. And he's doing it without ANY locking issues at all. (Yes, there's probably an instantaneous lock, but it won't be held even in a long-running transaction.)

So what's going on in Paul's code that makes it so sneaky? There are a few things, so I'll start from the middle and work out. The heart of it is the INSERT, which uses an OUTPUT clause to save the new IDENTITY value automatically generated by the table (note that, as others have posted, this OUTPUT technique is safer than using SCOPE_IDENTITY() or related functions). The OUTPUT saves the value in a table variable, which is critical because table variables do not participate in transactions.

This becomes important because the INSERT is nested inside a SAVE/ROLLBACK transaction. I'm guessing everybody who reads this is fully aware of BEGIN/COMMIT/ROLLBACK transactions, although I do find that many people don't realize how nested transactions work. Transactions can be named to help clarify nesting levels; however, these names have NO effect on COMMIT/ROLLBACK calls. COMMIT will always commit the innermost transaction, while ROLLBACK will always roll back the outermost transaction, including all nested transactions. The exception to this is when creating a SAVE point, as Paul does. A ROLLBACK cannot be limited to a nested BEGIN TRANSACTION, but it can be limited to a SAVE TRANSACTION point.

So what's happening here is Paul creates a SAVE point right before the insert, then inserts the data, and then immediately rolls back to the save point. The ROLLBACK undoes the INSERT, which includes releasing any locks that were acquired by the INSERT. However, because table variables are not included in transactions, the IDENTITY value which was OUTPUT into the table variable is still present and available for use. Also, the IDENTITY seed value in the table's metadata isn't included in the transaction and so stays incremented even through the rollback, so that future calls will continue incrementing it rather than repeating values.

    So the net result here is that, because of the save point and rollback, no rows are written to any table and no locks are being allocated. However, we still get back the new identity value, and the table meta data still has a record of what that newest identity value was so that it doesn't repeat. Sneaky indeed.
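Both behaviors are easy to verify (this assumes Paul's dbo.SneakyIdentity table from page 5 exists):

DECLARE @t TABLE (row_id BIGINT NOT NULL);

BEGIN TRANSACTION;

INSERT dbo.SneakyIdentity
OUTPUT inserted.row_id INTO @t (row_id)
DEFAULT VALUES;

ROLLBACK TRANSACTION;

-- The inserted row is gone from the base table,
-- but the table variable kept its copy:
SELECT row_id FROM @t;

-- And the identity seed stayed incremented through the rollback:
SELECT last_seed = IDENT_CURRENT(N'dbo.SneakyIdentity');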

    It's also worth paying attention to how Paul handles the TRY...CATCH block and his COMMIT transactions. Though not directly related to the intent of the post, they're often misunderstood and this is a great example of how to use them properly.

    Sorry for the wordiness, but hope this helps clarify things.

Wow! You folks have been busy! I'm just reading this at 2 pm EDT, and by the time I got to the end, I forgot what the original purpose was. I had to go back and re-read some sections to catch up.

We have a mail-order business which also employs a call center and Internet sales. We have from 10 to 25 people (depending on season) taking orders by phone or from mail, and the web runs 24-7. Each entry person uses the same custom OrderEntry program (VB6) as the front end; SQL '08 is the DB. Each order gets a unique order number. As you can see from the sample of the OrderNumber table below, the TYPE of order can determine the order number. Still, if there were only one type, it wouldn't matter. Simply put, the calling procedure adds 1 to the current order number. The next person gets the current order number and adds 1, etc. The web system pulls from the same table. Note the column "NextAvailableOrderNumber". In my 2.5 years here, we've never had a collision or duplication, and this system has been running since the late 90's. The calling program identifies the type based on location and feeds that as a parameter to the calling sproc. It's just a glorified counter (a guess at the mechanics is sketched after the table). Sorry if I'm missing the point.

CatalogYear  OrderType  Range  Description                      NextAvailableOrderNumber  OrderNumberRangeLow  OrderNumberRangeHigh
2012         C          1      Credit Card Orders               0                         400000               499999
2012         C          2      Credit Card Orders w/Cards       0                         570000               579999
2012         C          3      Credit Card Orders (FAX)         0                         890000               895999
2012         C          4      Credit Card Orders (EMail)       0                         580000               589999
2012         C          5      Credit Card Orders (999999)      0                         870000               873999
2012         C          6      Credit Card Orders (WEB Site)    0                         580000               599999
2012         E          1      Employee Charge                  0                         899000               899999
2012         R          1      Regular (Check) Orders           0                         600000               799999
2012         R          2      Regular (Check) Orders           0                         855000               857999
2012         R          3      Regular (Check) Orders (999999)  0                         855000               857999
2012         T          1      Telephone Orders                 100219                    100000               129999
2012         W          1      WalkIn Charge                    0                         897500               897999
2012         X          1      WalkIn Check                     0                         897000               897499
2012         Z          1      Employee Check                   0                         898000               898999
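In T-SQL terms, I assume the counter bump looks something like this (just a sketch - the table name and the exact predicate are guesses based on the columns above):

DECLARE @next TABLE (order_number INT NOT NULL);

-- Hypothetical: take the next number for 2012 telephone orders,
-- range 1, staying inside the allowed range for that order type.
UPDATE dbo.OrderNumber
SET NextAvailableOrderNumber += 1
OUTPUT inserted.NextAvailableOrderNumber INTO @next (order_number)
WHERE CatalogYear = 2012
  AND OrderType = 'T'
  AND [Range] = 1
  AND NextAvailableOrderNumber < OrderNumberRangeHigh;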

    Ajit, I'm more curious about how you scheduled your diagnostic procedure to run and find the blocks. I've copied and run it, but it found nothing, because I don't think anything was actually going on at the time. Did you just run it once, or did you have it repeat periodically, knowing the approximate time the offending procedure would run?

    Wallace Houston
    Sunnyland Farms, Inc.

    "We must endeavor to persevere."

  • This solution would appear to be a sound approach to resolve the locking issue. However, it seems that perhaps a simpler approach would be to move the call to the function to get the key out of the transaction. Call the function to get the key; if it is null then error out; if it is non-null proceed with transaction processing.
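For instance (a sketch only - the procedure name follows Paul's example above):

DECLARE @ID BIGINT, @rc INT;

-- Allocate the key before any transaction starts, so the
-- allocation's locks are never tied to the user transaction:
EXECUTE @rc = dbo.Allocate @ID OUTPUT;

IF @ID IS NULL
BEGIN
    RAISERROR('Key allocation failed', 16, 1);
    RETURN;
END;

BEGIN TRANSACTION;
-- ... transactional work using @ID goes here ...
COMMIT TRANSACTION;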

  • NBSteve (3/22/2013)


    Paul White (2/26/2011)


    It is possible to avoid the dummy column, ever-growing number of rows in the hidden tables, and avoid locking issues by using an approach based on this example code: (the example just uses a single range, but it is easy to extend the idea)

    --really cool code from page 5 snipped

    Wow, I stared at this code for a good 30 minutes, just admiring the beauty and cleverness and (unsuccessfully) trying to find a flaw with it. I'm sad that I'm discovering this 2 years late, but this is a really cool (and yes, sneaky) trick. Paul, have you ever made a separate blog post detailing this technique? (Or know of a similar one by someone else?)

Thanks, and no, I have never blogged about it - it seemed a bit of a narrow use case, to be honest. Your explanation is spot on, by the way.

  • Ajit, I'm more curious about how you scheduled your diagnostic procedure to run and find the blocks. I've copied and run it, but it found nothing, because I don't think anything was actually going on at the time. Did you just run it once, or did you have it repeat periodically, knowing the approximate time the offending procedure would run?

Thanks for your question. I wrote this article 2 years ago (it was republished yesterday), and back then I used the diagnostic script in a job which executed it in a loop with a pause of a few seconds (i.e. with a WAITFOR DELAY statement) and had some logic to log to a table. The script only captures details of locks that exist when it executes (i.e. at that instant).
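The rough shape of the job step was something like this (a reconstruction - the logging table, its columns, and the delay are illustrative, not the original script):

IF OBJECT_ID(N'dbo.LockCaptureLog', N'U') IS NULL
    CREATE TABLE dbo.LockCaptureLog
    (
        capture_time DATETIME2 NOT NULL,
        request_session_id INT NOT NULL,
        resource_type NVARCHAR(60) NULL,
        request_mode NVARCHAR(60) NULL
    );

WHILE 1 = 1
BEGIN
    -- Log only lock requests that are currently waiting (blocked):
    INSERT dbo.LockCaptureLog
        (capture_time, request_session_id, resource_type, request_mode)
    SELECT SYSDATETIME(), TL.request_session_id,
           TL.resource_type, TL.request_mode
    FROM sys.dm_tran_locks AS TL
    WHERE TL.request_status = N'WAIT';

    WAITFOR DELAY '00:00:05';   -- pause a few seconds between captures
END;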

    If you are looking for a more customisable and optimised version of such a diagnostic script, I'd highly recommend MVP Adam Machanic's sp_whoisactive stored procedure. The latest version of this procedure is v11.11 at the time of writing this comment, and this can be obtained from the following location --> (http://sqlblog.com/blogs/adam_machanic/archive/2012/03/22/released-who-is-active-v11-11.aspx)

    Ajit Ananthram
    Blog - http://ajitananthram.wordpress.com
