Using IDENTITY as a key column

  • The first part of the question was fine, but the second "SQL Server does not have an efficient way to retrieve rows based on a known value for PersonID." is not only poorly worded but the answer is just plain wrong.

    SQL Server DOES IN FACT HAVE an efficient way to retrieve rows based on a known value for PersonID.

    Here are three ways to increase the efficiency of retrieving rows based on known values for the PersonID column

    1. Adding a Primary Key to that column

    2. Adding a Unique Index or Unique Constraint to that column

    3. Adding a general Index based on the PersonID column

    The question should have read something like this:

    "By default SQL Server does not have an efficient way to retrieve rows based on a known value for PersonID."

  • sknox (4/20/2010)


    Hugo Kornelis (4/20/2010)


    Tom Brown (4/20/2010)


    A Good question. I learned something from this.

    I've used IDENTITY for ages, assuming its both duplicate-proof and automatically indexed.

    I'm glad to have this pointed out to me here rather than in a live production environment.

    Thanks, Tom! This is exactly why I submitted this question, because I see this misunderstanding way too often. I am glad I was able to learn you something today!

    I got this question right, but was very afraid of getting called "wrong", because I've seen that assumption about IDENTITY a lot. If I'd just looked at who wrote the question, I'd have had no fear!

    It's a pity, too, because you can just add a unique index constraint to the IDENTITY column to get the behavior you want.

    Man, oh man. I have been told my soooo many experienced DBAs and trainers that an IDENTITY will NOT be duplicated under any conditions. This is a myth that gets promulgated by trainers and those who should know better. And, it's a very dangerous one. Good thing I put a unique constraint when I use IDENTITY. Thanks for this Hugo.

    I just got a brainstorm (and boy, does it hurt) that someone should write an article about SQL Server myths.

  • JT-361866 (4/20/2010)


    The first part of the question was fine, but the second "SQL Server does not have an efficient way to retrieve rows based on a known value for PersonID." is not only poorly worded but the answer is just plain wrong.

    SQL Server DOES IN FACT HAVE an efficient way to retrieve rows based on a known value for PersonID.

    Not with the table as given in the question. All of the ways you list involve modifying the table itself, which is outside of the scope of the question.

  • WayneS (4/20/2010)


    Toreador (4/20/2010)


    "SQL Server does not have an efficient way to retrieve rows based on a known value for PersonID."

    There have obviously been too many trick questions lately, as I thought this was one as well!

    SQL Server does have an efficient way to retrieve rows based on a known value for PersonID, and here it is:

    create index indPersonID on Persons (PersonID)

    😉

    This was my reasoning also. Just as the first part (duplicates) would require manual intervention to accomplish, so to would the implementation of SQL Server's efficient way to retrieve rows.

    I was thinking along the same lines as well, but since the default behavior is to not create an index, and no index was created, and there could be many duplicates with that identity - i decided that the SQL would not query it efficiently.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Michael Poppers (4/20/2010)


    Hi, Hugo! and thanks for your QotD (which, despite my best efforts, I got right :-)). Just wanted to note (and I'm shocked, SHOCKED, that with 27 posts thus far in this discussion, no one yet noted this) something which I trust everyone would agree is really true: your CREATE TABLE will fail with a syntax error because you commented out the right parentheses and listed the last column with a comma!

    All the best from Michael

    If you take that code and rewrite it as such:

    CREATE TABLE Persons

    (PersonID int NOT NULL IDENTITY,

    FirstName nvarchar(40) NOT NULL,

    MiddleInitial char(1) NULL,

    LastName nvarchar(40) NOT NULL,

    DateOfBirth date NULL,

    );

    Note that the comma following the last column is still present. SQL Server will automatically ignore this last comma and the command will still complete successfully. However, the commenting of the closing parenthesis will cause an error.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Very intelligent question. I got this wrong though :(. The second part of the question tricked me into thinking that the SQL Server will have an efficient way to retrieve rows based on the IDENTITY column since records are arranged sequentially, however this is not true coz there is no clustered index or PK constraint defined on the PersonID.

    Amol Naik

  • Good question, Hugo!

    I would change "Because of the IDENTITY attribute, no duplicate values for PersonID will occur" to "Because of the IDENTITY attribute, no duplicate values for PersonID can[/i] occur."

    The first statement may or may not be true depending on whether someone overrides the default INSERT behavior so checking the box for that choice may or may not be correct. However, the second statement is definitely not true.

    Thanks. . .

  • Hugo, Great question. Made me think and analyze since in the "old days" the only way to regenerate the identity key was to drop the table and recreate it then reload the data.

    It is a shame that there are those who want to rewrite the question to make their answer right. JT, the question was pretty specific and if one reads the answer it does apply to the question. Maybe grammer is a little off, but it is a great question.

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • I see a lot of new comments have been posted while I was away. Thanks all for the feedback! I think that there's too many of them to address every comment individually, so I'll try to address all the major concerns in one post.

    First, thanks for the many kind words and compliments. I liked them all, especially this one:

    webrunner (4/20/2010)


    This was a good question. It taught me something about identity columns that I didn't know I didn't know, if you know what I mean. 🙂

    It is my firm belief that, while the things you know you don't know can be a nuisance, the things you don't know you don't know are the real danger - so I am very pleased with this feedback!

    On to the more, ahem, constructive;-) feedback.

    SeanLange (4/20/2010)


    I am surprised that nobody complained that the table definition has the last line commented out.

    Sean, and Michael who posted a similar remark - you are both right. I have no idea how this happened. My normal indenting style for CREATE TABLE is to have the closing parenthesis on a line of its own. So this is how the CREATE TABLE statement should have looked:CREATE TABLE Persons

    (PersonID int NOT NULL IDENTITY,

    FirstName nvarchar(40) NOT NULL,

    MiddleInitial char(1) NULL,

    LastName nvarchar(40) NOT NULL,

    DateOfBirth date NULL,

    -- other columns

    );In this case, the code runs (the comma is actually not permitted by a strict interpretation of Books Online, but the parser happens to be forgiving in this case).

    How I ended up moving the closing parenthesis to the preceeding (commented) line, I don't know. My apologies for the lack of proofreading.

    JT-361866 (4/20/2010)


    The first part of the question was fine, but the second "SQL Server does not have an efficient way to retrieve rows based on a known value for PersonID." is not only poorly worded but the answer is just plain wrong.

    SQL Server DOES IN FACT HAVE an efficient way to retrieve rows based on a known value for PersonID.

    Here are three ways to increase the efficiency of retrieving rows based on known values for the PersonID column

    1. Adding a Primary Key to that column

    2. Adding a Unique Index or Unique Constraint to that column

    3. Adding a general Index based on the PersonID column

    The question should have read something like this:

    "By default SQL Server does not have an efficient way to retrieve rows based on a known value for PersonID."

    The question was "which of the following statements are true about this table" [emphasis added], not "which of the following statements are true about this table plus some other DDL". If SQL Server had an auto-tune feature that automatically adds indexes it beliefs will benefit performance, than you would be right. But no current version of SQL Server has such a feature (and based on recommendations I sometimes see from the DTA, that might as well be for the best).

    skjoldtc (4/20/2010)


    Man, oh man. I have been told my soooo many experienced DBAs and trainers that an IDENTITY will NOT be duplicated under any conditions. This is a myth that gets promulgated by trainers and those who should know better. And, it's a very dangerous one. Good thing I put a unique constraint when I use IDENTITY. Thanks for this Hugo.

    I just got a brainstorm (and boy, does it hurt) that someone should write an article about SQL Server myths.

    My first reaction would be to go and find some better trainers. There's plenty of them around who will never tell you nonsense like that. And my second reaction would be to send those trainers my way (looking for an evil smiley, but can't find one)

    As to the brainstorm - do an internet search on "sql server myth" and you'll see that there are plenty of articles on that already. 🙂


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo responded:

    In this case, the code runs (the comma is actually not permitted by a strict interpretation of Books Online, but the parser happens to be forgiving in this case).

    True, the parser is forgiving of the comma, but if it's part of a pre-2008 installation, it will be less forgiving of the last column's type ;-).

  • Yeah, not a fan of the question. Seriously flawed IMHO.

    SQL does have an efficient way... no one is arguing with that.

    Identity is going to auto increment right, it won't allow duplicates... oh unless you tell it to break its own rules. You could also tweak the file using notepad... but why would you want to do that? 🙂

  • pjdiller (4/20/2010)


    Yeah, not a fan of the question. Seriously flawed IMHO.

    SQL does have an efficient way... no one is arguing with that.

    Identity is going to auto increment right, it won't allow duplicates... oh unless you tell it to break its own rules. You could also tweak the file using notepad... but why would you want to do that? 🙂

    It will only prevent duplicates if you have a unique key or primary key on the identity column. An identity column allows you to insert any int value into the field (when defined as int as an example) regardless of the current seed and increment. If you wanted to, you could easily insert 200 records all with the same value for that identity column - if there is no unique key or primary key on it.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Only if you first set IDENTITY_INSERT to ON; the default setting for IDENTITY_INSERT is OFF. . .

  • First, before I go on, let me say that I respect my place here as a "junior member." I appreciate what goes on here, and I consider this my main source for real information beyond BOL.

    Having said that, I am now a bit confused about this IDENTITY question. I created a table in SQL08.

    CREATE TABLE [dbo].[LOGON_AUDIT1](

    [Audit_ID] [int] IDENTITY NOT NULL,

    [Login_Name] [varchar](255) NULL,

    [Client_Host] [varchar](255) NULL,

    [Application_Name] [varchar](255) NULL,

    [Post_Time] [datetime] NULL

    )

    then I inserted a row.

    INSERT INTO [LOGON_AUDIT1]

    ([Login_Name]

    ,[Client_Host]

    ,[Application_Name]

    ,[Post_Time])

    VALUES

    ('joe'

    ,'joe''s host'

    ,'joe''s app'

    ,GETDATE())

    You won't be surprised by the result.

    I then ran this insert, attempting to create a new row with ID 1, which is already used.

    INSERT INTO [LOGON_AUDIT1]

    (Audit_ID,

    [Login_Name]

    ,[Client_Host]

    ,[Application_Name]

    ,[Post_Time])

    VALUES

    (1,

    'joe'

    ,'joe''s host'

    ,'joe''s app'

    ,GETDATE())

    I received an error:

    Msg 544, Level 16, State 1, Line 1

    Cannot insert explicit value for identity column in table 'LOGON_AUDIT1' when IDENTITY_INSERT is set to OFF.

    Now, I'm not convinced that this is the last word, but I wanted to describe the cause for my confusion. How do I create an insert statement that will get past an error like this without changing the default behavior by turning IDENTITY_INSERT to ON? (because it appears that I would have to intentionally turn that on in order to "break the rules" based on my little test)

  • pjdiller (4/20/2010)


    Now, I'm not convinced that this is the last word, but I wanted to describe the cause for my confusion. How do I create an insert statement that will get past an error like this without changing the default behavior by turning IDENTITY_INSERT to ON? (because it appears that I would have to intentionally turn that on in order to "break the rules" based on my little test)

    That is what you would do. The identity would still be present on the table, but you would still be able to insert a duplicate value into the table.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 15 posts - 31 through 45 (of 66 total)

You must be logged in to reply to this topic. Login to reply