Unique Identifiers

  • Has anyone had the need to maintain unique identifiers across different sql instances. Any ideas how this can be accomplished?

    ex:

    tbl_Account has account_id as identity column, however, i need to maintain account_id from instance_1 and instance_2 and be able to identify which instance generated it.

  • SA-1 (8/6/2012)


    Has anyone had the need to maintain unique identifiers across different sql instances. Any ideas how this can be accomplished?

    ex:

    tbl_Account has account_id as identity column, however, i need to maintain account_id from instance_1 and instance_2 and be able to identify which instance generated it.

    Well, I think the easiest thing to do is make 1 instance start at 1 and seed by 2, and have the other start at 2 and seed by 2. Odds and evens.

    Jared
    CE - Microsoft

  • The issue is when that instance_1 is production only and instance_2 needs both prod and test data. There is the need to maintain the prod ids in instance_2 with the flexibility to generate test data and delete it when testing completes.

    For instance, the state of tbl_account at different times...

    10:00 AM

    Instance_1

    1 - Prod

    2 - Prod

    3 - Prod

    4 - Prod

    10:15AM

    Instance_2

    1 - Prod

    2 - Prod

    3 - Prod

    4 - Prod

    5 - Test

    6 - Test

    10:15 AM

    Instance_1

    1 - Prod

    2 - Prod

    3 - Prod

    4 - Prod

    5 - Prod

    Now my data in instance_1 and instance_2 is out of sync. Moreover, If i delete all test data from instance_2 and load prod data from instance_1 my account_id 5 in instance_1 will be account_7 in instance_2.

  • SA-1 (8/6/2012)


    The issue is when that instance_1 is production only and instance_2 needs both prod and test data. There is the need to maintain the prod ids in instance_2 with the flexibility to generate test data and delete it when testing completes.

    For instance, the state of tbl_account at different times...

    10:00 AM

    Instance_1

    1 - Prod

    2 - Prod

    3 - Prod

    4 - Prod

    10:15AM

    Instance_2

    1 - Prod

    2 - Prod

    3 - Prod

    4 - Prod

    5 - Test

    6 - Test

    10:15 AM

    Instance_1

    1 - Prod

    2 - Prod

    3 - Prod

    4 - Prod

    5 - Prod

    Now my data in instance_1 and instance_2 is out of sync. Moreover, If i delete all test data from instance_2 and load prod data from instance_1 my account_id 5 in instance_1 will be account_7 in instance_2.

    Well, there are several ways to do this. Let me ask first, how are you "replicating" data from production to dev?

    Jared
    CE - Microsoft

  • Maintaining unique identifiers across computers is pretty much what GUIDs were invented for.

    Take a look at GUID and UNIQUEIDENTIFIER online. You'll find the relevant data easily enough.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (8/6/2012)


    Maintaining unique identifiers across computers is pretty much what GUIDs were invented for.

    Take a look at GUID and UNIQUEIDENTIFIER online. You'll find the relevant data easily enough.

    True, but being that this column is account_id, I would assume that the value generated is somewhat logical to an account; i.e. 10000 versus FF79CF4D-E023-40E1-A1EC-54321751D4F7. If the test data does not need to be representative of prod data, you could seed the dev table at 200000 or something and use the NOT FOR REPLICATION option, and replicate production data over. Or, again, seed odds and evens. Final option, don't replicate data... just restore the production backup to dev as needed.

    Jared
    CE - Microsoft

  • In our case, Uniqueidentifier / guid's won't work for account_ids.

    We could either restore from backup or replicate data or sync using SSIS to instance_2.

    I think this should work:

    Use Identity column for account_id in instance_1 and instance_2, however, while inserting test data in instance_2, insert negative account_id by enabling identity inserts to on.

    I could delete the negative id's and still not impact the identity seed for instance_1 data.

    Any thoughts on this?

  • If odd/even won't work, you can have test data be set up with negative numbers and production with positive numbers.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • SA-1 (8/6/2012)


    In our case, Uniqueidentifier / guid's won't work for account_ids.

    We could either restore from backup or replicate data or sync using SSIS to instance_2.

    I think this should work:

    Use Identity column for account_id in instance_1 and instance_2, however, while inserting test data in instance_2, insert negative account_id by enabling identity inserts to on.

    I could delete the negative id's and still not impact the identity seed for instance_1 data.

    Any thoughts on this?

    It still doesn't solve how you are replicating data. You can still get out of sync easily. One example is an insert and rollback of data in instance 2. Now you just incremented the data, but you will never see the rolled back id. So, you either replicate with the NOT FOR REPLICATION identity and start the seeding on instance 2 at a very high level. Or, just take the identity property completely off of the column. Why go through all of the trouble of letting your test instance increment the identity field when you don't even really want it to?

    Jared
    CE - Microsoft

  • Why go through all of the trouble of letting your test instance increment the identity field when you don't even really want it to?

    The test won't be adding positive identity data, it will instead perform negative inserts in to the identity column.

    1. When the test is complete, you could delete the test data without impacting the positive identity column (the next seed).

    2. If you load data from instance_1 in to instance_2 which is production data, you would automagically assign the next positive identity value in account_id.

  • Why would instance2 (test instance, right?) need to increment with a positive number, ever? Positive numbers on it would be real data (from instance1). Set it for identity insert when you import from instance1, but set it for a seed of -1 and increment of -1, so test data is always negative.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I can see situations where, the nightly production loads may need to be loaded to non-production systems every once in a while. In this case, you would need to generate positive account_id's in instance_2 (test).

    Essentially, you could do it either way, set the seed to (-1,-1) and manually generate the identity production ids or set it to (1,1) and manually generate the test ids.

    I much rather generate the test ids manually, unless in your experience you would recommend it in reverse.

  • SA-1 (8/6/2012)


    I can see situations where, the nightly production loads may need to be loaded to non-production systems every once in a while. In this case, you would need to generate positive account_id's in instance_2 (test).

    Essentially, you could do it either way, set the seed to (-1,-1) and manually generate the identity production ids or set it to (1,1) and manually generate the test ids.

    I much rather generate the test ids manually, unless in your experience you would recommend it in reverse.

    The problem is, these nightly production loads to non-production systems should never use identity in the test to increment. You don't need it to because the data is already generated. The whole point of an identity field is to allow the database engine to generate a sequential number as data is inserted. So, the non-production does not need an identity generated for data where the identity has already bee generated. Does that make sense? If you let the non-production database generate an identity for data that already has it, there are plenty of cases where it may be out of sync when it is done. You are not manually "generating" ids from production, they are already generated. You are simply inserting them.

    Jared
    CE - Microsoft

  • I would agree. However, the nightly process logic will dictate whether it's insert (there by generating a new id) or an update to an existing record.

  • SA-1 (8/6/2012)


    I would agree. However, the nightly process logic will dictate whether it's insert (there by generating a new id) or an update to an existing record.

    Ok, but what I am saying is that if you allow the test server to guess at the identity coming from the prod server and "hope" that they are in sync, you will be n for a world of hurt. You insert data from prod without utilizing the identity property in dev. Either by taking the identity property off completely or using the SET IDENTITY_INSERT tableName ON. Plus, since this will be a job or a stored proc... You don't have to worry about it each time. You write it once, and it is done. Meanwhile, if you need to insert test data with an ad hoc query, you will not have to do a SET IDENTITY_INSERT... every time you only want to insert 1 row of data.

    So, my choice? identity(-1, -1) and turn on identity_insert when migrating data. That is the only way you will guarantee that the id matches between servers for production data.

    Jared
    CE - Microsoft

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply