Which value should we use as a PK auto incremented or padded value???

  • While maintaining a database which values should we use as a PK. is it preferable to use auto incremented or our self generated padded values

    say self generated Padded value (in city master = "CTY_00001".."CTY_00009".."CTY_00010" ...."CTY_00999") and so on

    or auto incremented column (1,2...9..10.......99......999) and so on

    Consider the performance issues which is faster.

  • The identity property, if that's what you are referring to, is optimized and will be faster than a routine you write.

    That being said, this is a simple CLR routine that could potentially be close to the performance of an identity property.

  • sumit.joshij (4/3/2010)


    While maintaining a database which values should we use as a PK. is it preferable to use auto incremented or our self generated padded values

    say self generated Padded value (in city master = "CTY_00001".."CTY_00009".."CTY_00010" ...."CTY_00999") and so on or auto incremented column (1,2...9..10.......99......999) and so on

    Consider the performance issues which is faster.

    A column with the IDENTITY property has the following advantages:

    1. Asynchronous and non-blocking internal 'next value' implementation for very high concurrency

    2. Typically very compact - 4 bytes for an INTEGER type

    3. Extremely fast allocation routine

    4. Can be combined with a computed column to provide 'padded values'

    Disadvantages:

    1. May leave 'gaps' in the sequence due to aborted transactions.

    2. Cannot know the assigned values in advance.

    3. Multi-row operations not guaranteed to receive a contiguous range of values.

    Advantages of generated padded values:

    1. Values can be allocated before executing the INSERT (useful for batch operations).

    Disadvantages:

    1. Allocation routine is typically synchronous and blocking. Poor concurrency.

    2. Larger keys (9 bytes) - lower page density, bigger non-clustered indexes if PK is clustered.

    3. String searches and comparisons are typically very much slower than when using INTEGERs.

    4. Typically 10-100 times slower to allocate than IDENTITY.

  • Paul White NZ (4/4/2010)


    sumit.joshij (4/3/2010)


    While maintaining a database which values should we use as a PK. is it preferable to use auto incremented or our self generated padded values

    say self generated Padded value (in city master = "CTY_00001".."CTY_00009".."CTY_00010" ...."CTY_00999") and so on or auto incremented column (1,2...9..10.......99......999) and so on

    Consider the performance issues which is faster.

    A column with the IDENTITY property has the following advantages:

    1. Asynchronous and non-blocking internal 'next value' implementation for very high concurrency

    2. Typically very compact - 4 bytes for an INTEGER type

    3. Extremely fast allocation routine

    4. Can be combined with a computed column to provide 'padded values'

    Disadvantages:

    1. May leave 'gaps' in the sequence due to aborted transactions.

    2. Cannot know the assigned values in advance.

    3. Multi-row operations not guaranteed to receive a contiguous range of values.

    Advantages of generated padded values:

    1. Values can be allocated before executing the INSERT (useful for batch operations).

    Disadvantages:

    1. Allocation routine is typically synchronous and blocking. Poor concurrency.

    2. Larger keys (9 bytes) - lower page density, bigger non-clustered indexes if PK is clustered.

    3. String searches and comparisons are typically very much slower than when using INTEGERs.

    4. Typically 10-100 times slower to allocate than IDENTITY.

    I absolutely agree... why reinvent the wheel especially when the reinvented wheel frequently comes out square? 😛 I'll also state that things like "CTY_00999" end up having a rather low limit (100,000-1 in the case cited). Yeah, I know... some folks will justify the "format" by saying things like "well... we'll never have that many rows". Those are the same people that end up with a mess on their hands because the scope or requirements changed over time including such classic management decisions such as (really happened at an old job) "let's give all Canadian customers ID's starting at 100,000".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Ok.

    the padded value should be like any systimatic value say (00001...00009.....00010....00100...001000) like that. in this it support/possible the reverse indexing. But in the auto incremented value the reverse indexing is not possible. So I use this type of capturing data.

  • sumit.joshij (4/4/2010)


    Ok.

    the padded value should be like any systimatic value say (00001...00009.....00010....00100...001000) like that. in this it support/possible the reverse indexing. But in the auto incremented value the reverse indexing is not possible. So I use this type of capturing data.

    I'm sorry, but I have read that a few times now, and still don't see what you mean by 'reverse indexing'...? Could you explain it a little more please?

    Oh, and in case in wasn't clear from my earlier post, let me summarise:

    Using a generated padded value is probably a really bad idea.

    Use a column with the IDENTITY property.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply