Primary key vs NOT NULL unique key..

  • Just being curious that what are differences between a primary key and a NOT NULL unique key...

    One difference could be a primary key has a clustered index while a unique key has non clustered index..

    so if i create a column like..

    UNIQUE + NOT NULL + Clustered index

    How will it differ from a primary or will the above column be the same as a primary key?

  • PK does not have to be clustered. you can have a non-clustered pk. if you don't specify when you create it though, sql server defaults to clustered for PKs.

    I was going to say that one difference is you wouldn't be able to create a FK pointing to the column with the unique index. That is, I thought a FK has to point to a PK. But then I tried it and found that you can.

    so maybe there is no difference.

    ---------------------------------------
    elsasoft.org

  • You wont be able to include the table in replication if you only have a unique non-null key, and not a PK.

  • A PK is mainly for DA-purposes, to document the key that will be used to support FK-references to your table. ( qa.sqlservercentral.com/articles/Advanced/coddsrules/1208/

    )

    Nowadays many rdbms systems only need a unique key to use as a FK reference.

    I prefer to have a small primary key and use that one for DRI purposes.

    This way I can use surrogate keys to avoid db-impact if the "natural" unique key (e.g. a unique name column) needs to change characteristics (max length,..)

    DA/DBA tools like erwin, ... will show special graphics for PK info and will create fk-info only for pks.

    With sqlserver, if you create a PK, it will create the pk clustered if there isn't already a clustered index for the object, unless you specify the pk to be nonclustered.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • It depends on how you want to use them.

    You can have both on the same table, just so long as only one of them is clustered.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • steveb (4/28/2008)


    You wont be able to include the table in replication if you only have a unique non-null key, and not a PK.

    good point. that's a real difference.

    ---------------------------------------
    elsasoft.org

  • GSquared (4/28/2008)


    just so long as only one of them is clustered.

    It is not mandatory to have a clustering index, but I'd advise to always have one.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA (4/28/2008)


    GSquared (4/28/2008)


    just so long as only one of them is clustered.

    It is not mandatory to have a clustering index, but I'd advise to always have one.

    My point was meant to be more along the lines of "only one of them can be clustered, not both".

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Oh yes, there can only be one clustering index per table.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • A key is any attribute (column) or combination of attributes which uniquely identify the tuple (row). That is, there should not be more than one row with the same value(s) in the key.

    Some tables, such as history or log tables, do not have readily discernible keys. But most transactional tables do have at least one obvious key, and perhaps several. These are candidate keys. Of all the candidate keys, only one is chosen to be the Primary Key. There can be any number of reasons to choose one candidate key over all the others, not all of which are technical in nature. In fact, a surrogate field can be created for the express purpose to serve as the primary key.

    For example, the Employees table may have EmployeeID and SSN (Social Security Number). Each of these uniquely identity each employee. However, the SSN contains sensitive information and should not be propagated throughout the database, like it would be if used in foreign key relationships. So EmployeeID is the obvious choice.

    In recent years, there has been a move to always create a surrogate key field, even though one or more perfectly useful candidate fields may already exist. There are pros and cons about doing this; the biggest con I have seen is that designers then have a tendency to skip identifying the candidate keys. This is a mistake. All keys of an entity should be identified and a NOT NULL unique constraint defined for them.

    The SSN field in the Employees table, although not chosen to serve as the PK, is still a key. There should never be more than one employee with the same social security number and any attempt to insert a second row with the same value for SSN as any existing row should be caught as it happens. The field should still be uniquified so that can never happen. So even if you decide to use a surrogate PK, you still have to take the effort to correctly identify all the key fields of the table.

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • The problem with having a unique key on SSN is that it only allows 1 row with null. But what happens if you have employees in other countries where they don't have SSNs? Or have temporary employees from other countries? There are legal situations in which an employee might not have a SSN. There have also been (rare) circumstances in which the same SSN has been issued to two or more people, per the Social Security Administration.

    Sure, if you're a small, local shop, with only a few employees, that might work. But be prepared for having to deal with employees who live in places like India and China, and the resultant problems in your database when that happens.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • The problem with any candidate key is that someone can come in afterwards and "what if" it to death. You can take any attribute and keep expanding the context to the point where it is no longer unique. Apply that as a general principle and what you get is "there is no such thing as a unique attribute." Then where are you?

    Yes, part of the uniqueness of any attribute is the context of the data you will be working with. Part of the process of sound data modeling is properly identifying the context as well as properly identifying all the candidate keys that are unique within that context. It is sheer laziness and shoddy work to throw in the towel at every turn and tie uniqueness for all of your application's transactional data to a system-generated value and nothing else. You have to draw boundaries somewhere.

    Another sound data-modeling technique is to properly identify the wag who starts every sentence with "What if..." and get him out of the design meetings.:P

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Tomm Carr (4/29/2008)


    The problem with any candidate key is that someone can come in afterwards and "what if" it to death. You can take any attribute and keep expanding the context to the point where it is no longer unique. Apply that as a general principle and what you get is "there is no such thing as a unique attribute." Then where are you?

    Yes, part of the uniqueness of any attribute is the context of the data you will be working with. Part of the process of sound data modeling is properly identifying the context as well as properly identifying all the candidate keys that are unique within that context. It is sheer laziness and shoddy work to throw in the towel at every turn and tie uniqueness for all of your application's transactional data to a system-generated value and nothing else. You have to draw boundaries somewhere.

    Another sound data-modeling technique is to properly identify the wag who starts every sentence with "What if..." and get him out of the design meetings.:P

    Unfortunately for me, the person who "what ifs" is me, and I'm also the one ultimately responsible for the design.

    The only reason I mention SSN as a problem is because SO many people think "oh, that's perfect!", and use it as a primary or unique key, and then cause the problems I outlined.

    One database I'm currently working with, the designer had the realization that "oooohhhh, e-mail addresses are unique, let's make that the natural key for customers!" He didn't take it the necessary step further and realize that he himself has 3 e-mail addresses, which is not an uncommon occurance, nor did he account for multiple people (a customer and the customer's executive assistant, for example), accessing the same e-mail account. So, we have to have multiple entries for the same customer, and very oddball many-to-many joins, because of this key. (The next version of the database will have e-mails in a sub-table, in which it certainly can be a unique key, but not for the customer, just for the e-address.)

    So, yeah, it can be a pain to identify unique keys. There are natural unique keys, and in their absence, there are surrogate keys which can be reliably unique. My point is merely that SSN, specifically (and e-mail addresses), are not.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (4/29/2008)


    The only reason I mention SSN as a problem is because SO many people think "oh, that's perfect!", and use it as a primary or unique key, and then cause the problems I outlined.

    The only reason I used it in the first place was to illustrate the point that there could be non-technical reasons for choosing one key over another to make the PK. The fact that there may also be technical reasons is beside the point really. Come up with a better example.

    But your argument has brought up two points worth commenting on.

    1) Identifying keys can be a lot of trouble. Some designers think that the use of surrogate keys gets you around that trouble. It does not.

    2) Someone may have, at one time or another, chosen an inappropriate field for use as the PK. This can cause any number of problems which may not be easily fixed. But this means that the decision to use a natural key as the PK must be made with care; it does not mean that it can't or shouldn't be done at all. Bridges collapse, but we haven't given up building bridges.

    Another point specific to the use of SSNs. I agree that there are many reasons it should not be used as the PK. However, it is still a key and should be defined as such. (As the SSN is a US-only construct, let's assume that the context is US employees only.) Although not a PK, the payroll department, for one, may prefer to use it as the FK of choice for the relationships in its database. After all, the SSN is central to most of the information it processes -- including data sent to and received from the IRS. The IRS doesn't know EmployeeID from squat -- it wants SSNs. If, as you maintain, the SSA should give out the same SSN to two different people, and if your company should be so lucky as to hire both of those people, then the fact that Payroll is going to be working with two employees with the same SSN is something they are going to want to know about -- right away! If this situation went unnoticed for a while, there is no end of trouble they could be getting themselves into. So, while the SSN should never be defined as the Primary Key, it should always be defined as a key (not null, unique). The accuracy of your paycheck may depend on it. 🙂

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Tomm Carr (4/29/2008)


    The problem with any candidate key is that someone can come in afterwards and "what if" it to death. You can take any attribute and keep expanding the context to the point where it is no longer unique. Apply that as a general principle and what you get is "there is no such thing as a unique attribute." Then where are you?

    Yes, part of the uniqueness of any attribute is the context of the data you will be working with. Part of the process of sound data modeling is properly identifying the context as well as properly identifying all the candidate keys that are unique within that context. It is sheer laziness and shoddy work to throw in the towel at every turn and tie uniqueness for all of your application's transactional data to a system-generated value and nothing else. You have to draw boundaries somewhere.

    Another sound data-modeling technique is to properly identify the wag who starts every sentence with "What if..." and get him out of the design meetings.:P

    That would probably be me:) (the wag, that is)...

    I agree with you most of the way. I by no means tie UNIQUENESS solely to a system-generated ID unless the branding iron is hyperheated and being pressed into the side of my gluteus maximus....oh AND there's no other alternative...:)

    That being said - UNIQUENESS is only half of the definition of a primary key, in its core definition. It's an ADDRESSABLE UNIQUE key, whose primary function is to be what "children tables" use to relate to it. The primary characteristic for a primary key (in addition to uniqueness) is immutability, specifically because of its "deisgnated use" in table relations. And that's where I think the natural keys tend to fall down rather quickly, since they are "naturally" prone to changing. Rarely do natural keys tend to be as "thin" or as immutable as arbitrary system generated keys.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply