Identity Table vs Identity Column

  • Hello Everybody.

    I am trying to work out whether it is a major problem to create a table with a single identity column to provide a way of finding out the next job number for our logging system. I am proposing to insert into the table, retrieve the new number and then delete the row immediately, then use this number directly in the main table which holds the rest of the data we need, rather than use an identity column on the main table itself.

    The reason for this is that I am coding in C# and need to store some information before I have all the data to fill in the main table. Having read about sequence tables, it would appear that this method could lead to locking problems (mentioned by Jeff Moden). Is this method flawed, or would it work in this situation?

    I have not got a problem with gaps appearing in the sequence when an identity is obtained but not completed...
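    A minimal T-SQL sketch of the pattern described above, using hypothetical table and column names (`dbo.JobNumbers`, `JobNumber`) that are not from the thread:

    ```sql
    -- Hypothetical sequence table: a single identity column, nothing else.
    CREATE TABLE dbo.JobNumbers (JobNumber int IDENTITY(1,1) NOT NULL);
    GO

    -- Grab the next number, then delete the row immediately.
    DECLARE @JobNumber int;

    INSERT INTO dbo.JobNumbers DEFAULT VALUES;
    SET @JobNumber = SCOPE_IDENTITY();
    DELETE FROM dbo.JobNumbers WHERE JobNumber = @JobNumber;

    -- @JobNumber can now be held in the application and stored in the
    -- main table later, once the rest of the data is available.
    ```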

    Thank you in advance for your comments

    Jason Shaw

  • In cases where C# or other applications need a row ID before the row is inserted, I recommend GUIDs generated by the app.

    They'll take more space than an ID, but you can generate them in the application without a database trip at all. Just don't use them in the leading edge of any indexes on the table. No contention, reduced network traffic, and you have the ID any time you need it in the front end, for the cost of slightly more storage space. It's probably worth the trade off.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I've seen designs like this in the past and they lead to serious bottlenecks. I'm with Gus. If you have to do this, a GUID is your best bet.

    But, be very careful when using a GUID as the key of a clustered index, because it can lead to performance bottlenecks caused by fragmentation.

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Jason, you are using C#.NET, right? Why not take advantage of typed DataSets and their relations?

    For example, in .NET I can add parent and child records to any depth, and when it comes down to the final update, the DataAdapter / provider swaps out my temporary identity values (-1, -2, etc.) for the real keys returned from the server, replaces those keys in the child tables, updates those tables, then their children, and so on.

    I'd really recommend that instead of trying to maintain the IDs all on your own.

    Lowell


    --Help us help you! If you post a question, make sure you include a CREATE TABLE... statement and an INSERT INTO... statement for that table to give the volunteers here representative data. With your description of the problem, we can provide a tested, verifiable solution to your question! Asking the question the right way gets you a tested answer the fastest way possible!

  • Thank you for your ideas.

    GUIDs are a new concept to me and I am not sure that they would work in this case, as they are not good for indexing, and this number is used as a reference number for a number of tables that link together.

    The C# .NET data adapter would be better, but I am not sure that I feel comfortable enough with C# to achieve this correctly at the current time.

    Going back to the original plan, what exactly causes the bottleneck? Would it be better to have a table where we insert a number of entries, with a scheduled job to periodically remove all entries to keep the table size down to a minimum? We would probably be looking at a maximum of 300 entries per day, so is there likely to be a bottleneck?

    Jason

  • jasonshaw (5/23/2011)


    Going back to the original plan, what exactly causes the bottleneck.

    The fact that you have to serialise access to that table (one connection at a time) to avoid duplicate key values.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I am assuming that this serializing of access to the table is handled by SQL Server internally.

    Having an identity field on the main table would also cause the same serialization of inserts into the main table. I would expect that this would be more of a bottleneck, as this table has updates run on it as well, so there would be more traffic to this table. This table is also heavily read for reporting purposes, so is the problem really with identity columns as a whole?

    Jason

  • jasonshaw (5/23/2011)


    I am assuming that this serializing of access to the table is handled by SQL Server internally.

    No, it's handled by you, in code.

    If your code assumes there's only ever one row in that table, then you have to be completely sure that there is only one connection executing the insert/select/delete code at a time. Otherwise two connections could each insert a row, then both select; because the code assumes there's only one row in the table, one of them gets the other's ID, tries to insert, and you get duplicate key errors.

    If you're going to use @@identity rather than selecting the new value it's a lot safer and the serialisation isn't necessary.

    Having an identity field on the main table would also cause the same serialization of inserts into the main table

    Yes, but in that case it's handled as part of the insert operation and is optimised very well, not as a separate step as you're planning. It depends on how exactly you're going to implement it.
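    A minimal sketch of the contrast, using a hypothetical `dbo.JobLog` table: with an identity column on the main table itself, the new value is generated atomically as part of the insert, with no application-level serialisation.

    ```sql
    -- Hypothetical main table with its own identity column.
    CREATE TABLE dbo.JobLog
    (
        JobNumber int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        JobDetail varchar(100) NOT NULL
    );
    GO

    -- The identity value is generated as part of the insert operation
    -- itself; no separate insert/select/delete step is required.
    DECLARE @JobNumber int;

    INSERT INTO dbo.JobLog (JobDetail) VALUES ('new job');
    SET @JobNumber = SCOPE_IDENTITY();  -- safe under concurrency
    ```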

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you for that GilaMonster.

    I was planning to use the @@IDENTITY function to get the job number back; I don't think I had made that clear enough.

    Jason

  • jasonshaw (5/23/2011)


    Thank you for your ideas.

    GUIDs are a new concept to me and I am not sure that they would work in this case, as they are not good for indexing, and this number is used as a reference number for a number of tables that link together.

    The C# .NET data adapter would be better, but I am not sure that I feel comfortable enough with C# to achieve this correctly at the current time.

    Going back to the original plan, what exactly causes the bottleneck? Would it be better to have a table where we insert a number of entries, with a scheduled job to periodically remove all entries to keep the table size down to a minimum? We would probably be looking at a maximum of 300 entries per day, so is there likely to be a bottleneck?

    Jason

    Gail already outlined the bottleneck.

    You can use a GUID as a PK and FK on tables just fine. It will work well. I just don't recommend it as a clustered index, is all.
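    A sketch of that design, with hypothetical names not taken from the thread: the GUID is the primary key (nonclustered), while the clustered index goes on a narrower, ever-increasing column.

    ```sql
    -- Hypothetical table: GUID primary key, but NOT the clustered index.
    CREATE TABLE dbo.Jobs
    (
        JobID     uniqueidentifier NOT NULL
                  CONSTRAINT PK_Jobs PRIMARY KEY NONCLUSTERED,
        LoggedAt  datetime NOT NULL
                  CONSTRAINT DF_Jobs_LoggedAt DEFAULT (GETDATE()),
        JobDetail varchar(100) NOT NULL
    );
    GO

    -- Cluster on the ever-increasing date instead, avoiding the
    -- fragmentation that random GUIDs cause in a clustered key.
    CREATE CLUSTERED INDEX CIX_Jobs_LoggedAt ON dbo.Jobs (LoggedAt);

    -- The application supplies the GUID (e.g. Guid.NewGuid() in C#);
    -- child tables reference JobID as an ordinary foreign key.
    ```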

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • I tested a GUID column and ran into performance problems for this exact reason, returning the unique value for an application that generates service requests. My solution was to use a custom-written SQL stored procedure that generates a 10-character string with random numbers and a random sequence of characters. This led to a much more efficient process without using an identity or a GUID, and accomplished my goal. It has been working for a few years now with no problems.

  • jasonshaw (5/23/2011)


    Thank you for that GilaMonster.

    I was planning to use the @@IDENTITY function to get the job number back, I don't think I had made it clear enough.

    jason

    As a recommendation look into scope_identity() as well so you understand where @@IDENTITY can cause some issues.
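    A brief illustration of the difference: @@IDENTITY returns the last identity value generated on the connection in *any* scope, so a trigger that inserts into another identity table silently changes it, whereas SCOPE_IDENTITY() returns the value generated in the current scope. All table names here are hypothetical.

    ```sql
    -- Hypothetical main and audit tables, each with an identity column.
    CREATE TABLE dbo.JobLog      (JobNumber int IDENTITY(1,1), JobDetail varchar(100));
    CREATE TABLE dbo.JobLogAudit (AuditID   int IDENTITY(1,1), AuditNote varchar(100));
    GO

    -- A trigger on the main table that writes to the audit table.
    CREATE TRIGGER trg_JobLog_Audit ON dbo.JobLog AFTER INSERT AS
        INSERT INTO dbo.JobLogAudit (AuditNote) VALUES ('row inserted');
    GO

    INSERT INTO dbo.JobLog (JobDetail) VALUES ('new job');

    SELECT @@IDENTITY;        -- identity from dbo.JobLogAudit (the trigger's insert)
    SELECT SCOPE_IDENTITY();  -- identity from dbo.JobLog (what you usually want)
    ```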


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • You CAN use a sequence table, but not quite the way you have it. You don't need to do an insert/delete... you can do an UPDATE using the well-documented 3-part UPDATE. You need to use the 3-part update to avoid the need for serialization and the very high potential for deadlocks that it brings on.
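    The "3-part UPDATE" assigns the variable and the column in a single atomic statement, so no separate SELECT (and no insert/delete) is needed. A sketch with hypothetical names:

    ```sql
    -- Hypothetical sequence table: one permanent row, one counter column.
    CREATE TABLE dbo.Sequence (NextID int NOT NULL);
    INSERT INTO dbo.Sequence (NextID) VALUES (0);
    GO

    -- The 3-part UPDATE: increment the counter and capture the new
    -- value in one atomic statement; concurrent callers each get a
    -- distinct number without application-level serialization.
    DECLARE @NextID int;

    UPDATE dbo.Sequence
       SET @NextID = NextID = NextID + 1;

    SELECT @NextID;  -- the job number to use
    ```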

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions
