Which value should we use as a PK auto incremented or padded value???

Question

Post reply

Which value should we use as a PK auto incremented or padded value???

sumit.joshij

Ten Centuries

Points: 1040
More actions
April 3, 2010 at 1:38 pm

#220034

While maintaining a database which values should we use as a PK. is it preferable to use auto incremented or our self generated padded values

say self generated Padded value (in city master = "CTY_00001".."CTY_00009".."CTY_00010" ...."CTY_00999") and so on

or auto incremented column (1,2...9..10.......99......999) and so on

Consider the performance issues which is faster.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 728159 More actions · Answer 1

The identity property, if that's what you are referring to, is optimized and will be faster than a routine you write.

That being said, this is a simple CLR routine that could potentially be close to the performance of an identity property.

Follow me on Twitter: http://www.twitter.com/way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com

Paul White SSC Guru Points: 150457 More actions · Answer 2

sumit.joshij (4/3/2010)

While maintaining a database which values should we use as a PK. is it preferable to use auto incremented or our self generated padded values

say self generated Padded value (in city master = "CTY_00001".."CTY_00009".."CTY_00010" ...."CTY_00999") and so on or auto incremented column (1,2...9..10.......99......999) and so on

Consider the performance issues which is faster.

A column with the IDENTITY property has the following advantages:

1. Asynchronous and non-blocking internal 'next value' implementation for very high concurrency

2. Typically very compact - 4 bytes for an INTEGER type

3. Extremely fast allocation routine

4. Can be combined with a computed column to provide 'padded values'

Disadvantages:

1. May leave 'gaps' in the sequence due to aborted transactions.

2. Cannot know the assigned values in advance.

3. Multi-row operations not guaranteed to receive a contiguous range of values.

Advantages of generated padded values:

1. Values can be allocated before executing the INSERT (useful for batch operations).

Disadvantages:

1. Allocation routine is typically synchronous and blocking. Poor concurrency.

2. Larger keys (9 bytes) - lower page density, bigger non-clustered indexes if PK is clustered.

3. String searches and comparisons are typically very much slower than when using INTEGERs.

4. Typically 10-100 times slower to allocate than IDENTITY.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 3

Paul White NZ (4/4/2010)

sumit.joshij (4/3/2010)

While maintaining a database which values should we use as a PK. is it preferable to use auto incremented or our self generated padded values

say self generated Padded value (in city master = "CTY_00001".."CTY_00009".."CTY_00010" ...."CTY_00999") and so on or auto incremented column (1,2...9..10.......99......999) and so on

Consider the performance issues which is faster.

A column with the IDENTITY property has the following advantages:

1. Asynchronous and non-blocking internal 'next value' implementation for very high concurrency

2. Typically very compact - 4 bytes for an INTEGER type

3. Extremely fast allocation routine

4. Can be combined with a computed column to provide 'padded values'

Disadvantages:

1. May leave 'gaps' in the sequence due to aborted transactions.

2. Cannot know the assigned values in advance.

3. Multi-row operations not guaranteed to receive a contiguous range of values.

Advantages of generated padded values:

1. Values can be allocated before executing the INSERT (useful for batch operations).

Disadvantages:

1. Allocation routine is typically synchronous and blocking. Poor concurrency.

2. Larger keys (9 bytes) - lower page density, bigger non-clustered indexes if PK is clustered.

3. String searches and comparisons are typically very much slower than when using INTEGERs.

4. Typically 10-100 times slower to allocate than IDENTITY.

I absolutely agree... why reinvent the wheel especially when the reinvented wheel frequently comes out square? 😛 I'll also state that things like "CTY_00999" end up having a rather low limit (100,000-1 in the case cited). Yeah, I know... some folks will justify the "format" by saying things like "well... we'll never have that many rows". Those are the same people that end up with a mess on their hands because the scope or requirements changed over time including such classic management decisions such as (really happened at an old job) "let's give all Canadian customers ID's starting at 100,000".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

sumit.joshij Ten Centuries Points: 1040 More actions · Answer 4

sumit.joshij

Ten Centuries

Points: 1040

April 4, 2010 at 12:10 pm

#1145490

Ok.

the padded value should be like any systimatic value say (00001...00009.....00010....00100...001000) like that. in this it support/possible the reverse indexing. But in the auto incremented value the reverse indexing is not possible. So I use this type of capturing data.

Paul White SSC Guru Points: 150457 More actions · Answer 5

sumit.joshij (4/4/2010)

Ok.

the padded value should be like any systimatic value say (00001...00009.....00010....00100...001000) like that. in this it support/possible the reverse indexing. But in the auto incremented value the reverse indexing is not possible. So I use this type of capturing data.

I'm sorry, but I have read that a few times now, and still don't see what you mean by 'reverse indexing'...? Could you explain it a little more please?

Oh, and in case in wasn't clear from my earlier post, let me summarise:

Using a generated padded value is probably a really bad idea.

Use a column with the IDENTITY property.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi