Identity field as a key

  • quote Take the same row, delete it then insert it back again. Gee, all of the PK-FK references are trashed, even tho the data is still the same.

    If the key (identity or natural) is subject of a FK constraint then you cannot delete it anyway

    quote Next, the space arguments fails because in a properly designed RDBMS, you will enforce a natural key; the IDENTITY is redundant.

    There is no such thing as a 'properly designed RDBMS' that is a term used by people who think they are the defacto standard

    I have seen databases that are supposedly 'properly designed' and 'obey all the rules' and found them to be a bl**dy mess, a nightmare and poor performance

    quote Speed of joins? This is the 21-st century and unless you are trying to a Wal-Mart sized Data warehouse on SQL Server, 32-bit and 64-bit machines running at nano-second speeds can handle a lot of data.

    Oh shame on you That is modern day thinking. If the query does not run fast enough throw more memory and processors at it, irrespective how badly written it is. Yes they can handle a lot of data but they can do it very badly as well.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • >> Yes it is; creationism versus scicnce

    Once again, opionated crap.  Useless feedback.

    >> Things are worse if you trry to ue an IDENTITY over multiple tables.  Unlike the old IMS/IDMS/TOTAL/network DB pointer chains, there is no utility program to validate and restore them. 

    I don't think a key (in a looser meaning) necessarilly has to be meaningful and uniquely describes an entity.  Yeah, I know that is the true definition of a key, but think of real word examples.  Does a SS# describe a person?  It is a number that really has no meaning other than an attempt to uniquely IDENTIFY a person.  Same logic holds for license plates, VINs, license numbers, sales order number, etc.  If you really think about it, there is hardly a true key to identify an entity.  So, what is wrong with doing it in the relational db world?

    If you can assign a value (ie number) to an instance of an entity that uniquely identifies that entity, then it is acceptable to use it as a key.  If non-modeling terms...if a row can be uniquely identified using an uniquely assigned value, then you can make that column a key.  And, Joe, that does include ROWGUIDCOL property of the uniqueidentifier type.

    >> Take the same row, delete it then insert it back again.  Gee, all of the PK-FK references are trashed, even tho the data is still the same.

    Don't you have foreign key contraints created, as well?  You should, and this will not happen and cannot ever happen.  It would violate Dr. Codd's principle of a primary key and foreign key relationship.

    >> A sequential numbering hs to be done with a sequential process.  Kiss any hope of parallelism goodbye.  Factor in the cost of the physical reads and the time to increment the counter. 

    You can use GUID if you need "parallelism".  oh, as you stated, think "nano-seconds", Joe.  This is the 21st century.

  • Aah, see, it's getting heated again

    Anyway...

    Next, the space arguments fails because in a properly designed RDBMS, you will enforce a natural key; the IDENTITY is redundant. 

    What is a properly designed DB anyway? You surely can have a sound schema, even with IDENTITIES. If all else is "perfectly" designed, the use of IDENTITIES is IMO an acceptable "flaw". But mostly you'll face a poor schema *and* IDENTITIES. And that's what actually makes it worse.

    Speed of joins?  This is the 21-st century and unless you are trying to a Wal-Mart sized Data warehouse on SQL Server, 32-bit and 64-bit machines running at nano-second speeds can handle a lot of data. 

    I disagree here. Operations on, say INT columns (usually the case with IDENTITIES) tend to be much faster than operations on CHAR or VARCHAR. Even true in the 21st century. And even true if you use a binary collation. Especially on VARCHARs where column width isn't fixed and has to be taken into account when sorting.

    However, I really think that the relational model is not all about performance and fastest data access in the first place. If it were that way, Dr. Codd wouldn't have invented it, or at least invented it completely different. He would rather have developed hierarchical or network databases further.

    It is really interesting when you follow online communities, you ever so often see people asking questions like "Why is SQL Server (or any other Enterprise RDBMS for that matter) soooo slow. After all, it is a RDBMS?".

    People may seem to mistakenly consider achieving miracle lightning-fast performance only by porting from a garage-based start-up suitable desktop database to a full-blown Enterprise RDBMS. That is kinda pathetically hilarious. A well designed Access database will outperform any poorly designed SQL Server database. You can hide this fact for some time by better hardware, more RAM. But then comes a point when you must drop your trouser and face the fact that you can't get more performance out of a bad schema.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Okay, but that does not address the issue at hand that we are arguing which is "why is it wrong to use an IDENTITY column as a primary key?"  There is no need to validate and verify an identity column.  It is what it is.  It's just a unique number assigned to an instance of an entity.

    I may have a table with SS# or a VIN number, but I would rather use an IDENTITY column for the key than either of those data items.  A natural key is not always the best to use.  A car is owned by a driver and a driver owns one or more cars.  The car has one of more serialized parts.  Making the VIN the key for the car table may cause you problems down the road.  Unless you have cascade update on it is cumbersome to fix the keys for the CarParts table which would be the VIN + part No in the case that the wrong VIN was entered.  With the Identity column as key, you're just updating the Car table  with the correct VIN since you are not modifying the key itself.

    Also, I don't disclose an internal key.  I would still have the VIN in the table.  It just would not be the primary key and the end user would never see the identity value.

    >> I am one of the few people that has actually dealt with the design of encosing schemes in detail.  Newbies are not taught (and often do not wan to learn) how to design an encoding scheme.  It's hard work and takes brains; putting an IDENTITY on every table is soooo much eaier and quicker. 

    Since when is "easier and quicker" not what people with "brains" do?  If you want to impress yourself, go right ahead, but the bottom line is the end user doesn't care what's going on behind the scenes (aside from sloppy work).  There benefits as posed by many here to using the IDENTITY column as a key.  If you design schemas (which I do) correctly, use of an Identity column can ease the pain of joins, UI development, etc. 

  • Sad to say Ron never made that statement. Check out his comments on the article -

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=6136

    I am not commenting on the article altogether, but only a quote attributed to me, which I have never made. I certainly believe that sometimes Identity is a perfectly good choice as a PK. I have never said or believed otherwise. (Identity is not necessarily a great choice as your clustering index - but thats another discussion, and thats a physical issue, not a logical one. And you certainly could have your PK be on your identity column, and your clustered index on something else in any case.)

    The author says he is 'paraphrasing' me. All I can think of is that he is referring to a comment I wrote in Inside SQL Server, where I decry the use of cursors purely to mimic ISAM style behavior with a series of nested selects. I said that people often do this as a quick and dirty way to port an ISAM app to SQL Server, but I viewed that approach only for people who think there is not enough time to do things right, but always enough time to do it over. My comment had nothing to do with identity as a PK choice. His paraphrasing of a quote of mine is towards an entirely different subject.

    A better tongue in cheek saying that I'd use in this discussion, where I think Identity CAN be a perfectly suitable PK (but as always it depends), is "Generalizations are always wrong." Regards,

    Ron

    Edited by - Ron Soukup on 11/30/2001 20:05:15

    Joe - it's a good quote, just not true.

    Quand on parle du loup, on en voit la queue

  • Patrick,

    thanks for digging this out! I knew I read it somewhere, but apparently didn't work through the discussion on it. They should have edited the article on SQL Team anyway thereafter

    Be it as it is, I'll second Joe here. It's a great quote. No matter who actually made it 

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Ok I am going to throw in my opinion here and anyone who just wants to can throw back.

    With the case of using an indentity column it usually boils down to not having enough information to discover a natural key so instead

    you substitute an artifical one.

    And sometimes what appears to be the best natural key to begn with actually turns out to fail at a later point.

    And example of this is Social Security Number. This is a number that identifies a US taxpayer uniquely from other taxpayers. On the surface this number would seem like a great natural key if you have but in reality you are missing a key piece of info if your company lasts a long time.

    The problem most people donm't take into account is that SSN is created thru a rules based system, each segment being generated based on a condition such as state where you reside at the time you apply.

    Over a period of time the number can and will be reused so you need to add to that one additional element to truely have unique, you need the start use date of the person (technically you could have gotten assigned later than your birth date so birth date I wouldn't call the best choice).

    Now that you have the start use date you have a true unique key for the person as it cannot be reassigned while they are alive, thus your key would actually be SSN and the start use date.

    Same thing would apply with telephone numbers, these numbers can be reassigned but not while in use so a start use date completes the key.

    Now as with a system that stores books, it would be better to assign using the key of ISBN (may be an occasinal issue) that an artifical key but it comes down to if you are collecting that much information.

    If not then you need to assign the record a unique value to represent it in other tables so when you talk about "How To Kill a Mockingbird" published in 1981 by "Jones and Wade" versus "How To Kill a Mockingbird" published in 1987 by "Milton Burl Books" you have a reusable value that traces back to one and only one version. You then create a indentity field. Personally I do think ISBN is easy enough to collect and a better key.

    Now with AUthor you will have very limited details so an Identity column is going to be your best choice for unique value.

    As for your layout I would do like so.

    Table Author

    -------------

    AuthorID (identity Field, key)

    Table Book

    -----------

    BookID (identity Field, key)  (ISBN is preferred).

    AuthorID (foreign key) -- I haven't seen an author change.

    Edition

    -------

    EditionID (Editon Number, key) -- Assume 1 on any edition without

    Printing (Printing Number, key) -- Many only go thru 1 printing but you can have additional print runs and this will give you better tracking

    Print_Date (Date of printing, datetime) -- You need this either way, If you don't use Printing then this should be the date of the first print run.

    BookID  (foreign key)

  • Just out of curiosity:

    Would anyone use the SSN as Primary Key? What about privacy or data protection in this case?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Not to mention the laws governing SSN actually say it cannot be used as a form of identification except by certain orginizations. Just an example but I am sure someone has used it that way.

  • SSN would also limit the users to the US (unless we can convince our other employees in other countries of the beauty of having a SS card).

    The good thing is that SSN are not supposed to be reused (unless you want to mention the slight problem of identity theft).

    Quand on parle du loup, on en voit la queue

  • Thanks for the explanation, Joe.

    Not that I as a German deal terribly much with SSN, however as it is quite often mentioned it's good to have some background information about it.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply