inserting data from one table to another and want to create primary key at same time

  • The base column would, as I understand it, always come across as-is in terms of it's NULL/NOT NULL property if selected straightaway in a SELECT...INTO. By 'not copying NULLability' I figured you were talking about overriding the base column using an expression, as you showed by wrapping a column in the SELECT-list with ISNULL, which seems to work on all the data types I tried it on. Similarly you can make a NOT NULL column NULL-able by wrapping it with NULLIF.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • ... Similarly you can make a NOT NULL column NULL-able by wrapping it with NULLIF.

    ... or CASE WHEN where relevant.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • COALESCE too...

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi Guys.

    Thanks for all your comments. Anyway going back to the original question will it be a definate that I have to create the table first before selecting into as the source table I will be selecting from will have NULL values.

    It looks like I have to create the table first so my questions is if i do

    create table bla bla bla

    then select * from source table into bla bla bla

    what will happen if it comes across a null value? will everything stop or will it carry on inserting what it can?

    also is there a way to say

    If IS NULL then

    do something?

    else

    carry on inserting

    END IF

  • alan_lynch (3/12/2013)


    Hi Guys.

    Thanks for all your comments. Anyway going back to the original question...

    Yeah, we can get distracted delving too far into the details 😀

    ...will it be a definate that I have to create the table first before selecting into as the source table I will be selecting from will have NULL values.

    Nope. Not necessary. If you have a lot of data try it different ways. Some things to try:

    1. Use SELECT...INTO to create the table and populate it in one step. Then add the PK in a second step. This what the sample code below does.

    2. Try creating the table first and using INSERT INTO...SELECT to populate it. Then add the PK after the data is in the table. If your database is in SIMPLE or BULK_LOGGED recovery mode add the TABLOCK hint to the INSERT INTO...SELECT.

    3. Again create the table first, but immediately add the PK to the empty table. Then use INSERT INTO...SELECT to populate it. With this one also, if your database is in SIMPLE or BULK_LOGGED recovery mode add the TABLOCK hint to the INSERT INTO...SELECT.

    You may be surprised at the outcome. Please post the results if you find the time.

    It looks like I have to create the table first so my questions is if i do

    create table bla bla bla

    then select * from source table into bla bla bla

    what will happen if it comes across a null value? will everything stop or will it carry on inserting what it can?

    It would fail when it comes time to add the PK so you'll want to ignore those rows by using a filter in the SELECT portion of your INSERT INTO...SELECT (see sample code below).

    also is there a way to say

    If IS NULL then

    do something?

    else

    carry on inserting

    END IF

    You can do this using the ISNULL function inline (see sample code), the NULLIF function, the COALESCE function, or a combination of those. If you have a very complex set of logic you want to apply to a single column can use a CASE expression in the SELECT portion of the INSERT INTO...SELECT query to manipulate the outcome based on multiple conditions.

    The code you posted initially was a good start and Jeff extended it to handle the requirements. The new wrinkle you just added about omitting rows with a NULL name is reflected below. I simply modified Jeff's code to add that condition.

    To force your key column to be NOT NULL in your SELECT...INTO you'll use the ISNULL function as Jeff showed, and that plus the filter on the WHERE-clause I just added should allow you to add your PK constraint without the engine complaining about the NULL-ability of the column.

    Here is a working proof of concept with some sample data that I think tests the requirements we have so far:

    USE tempdb;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'dbo.bigtable')

    AND type IN (N'U') )

    DROP TABLE dbo.bigtable;

    GO

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'dbo.lookuptable')

    AND type IN (N'U') )

    DROP TABLE dbo.lookuptable;

    GO

    CREATE TABLE dbo.bigtable

    (

    Software_Name_Original VARCHAR(100) NULL,

    Software_Name_Raw VARCHAR(100)

    );

    GO

    INSERT INTO dbo.bigtable

    (Software_Name_Original, Software_Name_Raw)

    VALUES ('Pacman', 'Pacman'),

    ('Ms. Pacman', 'Pacman'),

    ('Dig Dug', 'Dig Dug'),

    (NULL, 'Galaga');

    WITH cteDupeCheck

    AS (

    SELECT RowNum = ROW_NUMBER() OVER (PARTITION BY Software_Name_Raw ORDER BY Software_Name_Raw),

    SN_Original = Software_Name_Original,

    SN_New = Software_Name_Raw

    FROM dbo.BigTable

    WHERE Software_Name_Original IS NOT NULL -- this will guarantee that only rows with values are pulled into 'lookuptable'

    )

    SELECT SN_Original = ISNULL(SN_Original, 0) --ISNULL makes the column NOT NULL

    ,

    SN_New

    INTO dbo.Lookuptable

    FROM cteDupeCheck

    WHERE RowNum = 1;

    --===== Since we now have a NOT NULL column,

    -- add the desired unique clustered index.

    ALTER TABLE dbo.LookUpTable

    ADD CONSTRAINT PK_LookUpTable

    PRIMARY KEY CLUSTERED (SN_Original);

    -- show the results

    SELECT *

    FROM dbo.lookuptable;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • ok guys this is what I have with an error, see error below

    use TestData

    go

    Drop table lookuptable;

    WITH cteDupeCheck

    AS (

    SELECT RowNum = ROW_NUMBER() OVER (PARTITION BY Software_Name_Raw ORDER BY Software_Name_Raw),

    SN_Original = Software_Name_Original,

    SN_New = Software_Name_Raw,

    SV_New = Software_Version_raw,

    SP_New = Software_Publisher_Raw,

    Licensable = Software_Category,

    MSDN_Flag,

    CDL_Flag,

    Source,

    Pending_Classification_Flag,

    Auto_Classification_Flag,

    Software_Classification_Version,

    Manual_Deletion,

    Load_Date

    FROM dbo.BigTable

    WHERE Software_Name_Original IS NOT NULL -- this will guarantee that only rows with values are pulled into 'lookuptable'

    )

    SELECT SN_Original = ISNULL(SN_Original, 0) --ISNULL makes the column NOT NULL

    ,

    SN_New,

    SV_New,

    SP_New,

    Licensable,

    MSDN_Flag,

    CDL_Flag,

    Source,

    Pending_Classification_Flag,

    Auto_Classification_Flag,

    Software_Classification_Version,

    Manual_Deletion,

    Load_Date

    INTO dbo.Lookuptable

    FROM cteDupeCheck

    WHERE RowNum = 1;

    --===== Since we now have a NOT NULL column,

    -- add the desired unique clustered index.

    ALTER TABLE dbo.LookUpTable

    ADD CONSTRAINT PK_LookUpTable

    PRIMARY KEY CLUSTERED (SN_Original);

    -- show the results

    SELECT *

    FROM dbo.lookuptable;

    ERROR MESSAGE IS AS FOLLOWS

    (19905 row(s) affected)

    Msg 1505, Level 16, State 1, Line 40

    The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.Lookuptable' and the index name 'PK_LookUpTable'. The duplicate key value is (Cover Designer).

    Msg 1750, Level 16, State 0, Line 40

    Could not create constraint. See previous errors.

    The statement has been terminated.

  • I see it. Change this:

    RowNum = ROW_NUMBER() OVER (PARTITION BY Software_Name_Raw ORDER BY Software_Name_Raw),

    to

    RowNum = ROW_NUMBER() OVER (PARTITION BY Software_Name_Original ORDER BY Software_Name_Original),

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • absolutey fantastic

    thanks very much that works

    Alan

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply