Help with joins for maintaining history table

  • Hi there,

    I have 3 tables. Step1) Table1 called BulkCopy gets repopulated 1st of the month.

    Step 2) Table2 called staging gets inserted all records from BulkCopy every month. It has a column CheckSumValue (that calculates the hashvalue of all columns).

    Step 3a)Table3 called Master has all new records from staging; Step 3b)any records that got updated (may be address of the Company) will be added as a new record; So, that way we have a history of changes made to the Company

    record.

    CREATE TABLE [dbo].[BulkCopy](

    [CompanyName] [varchar](200) NULL,

    [CompanyNumber] [varchar](200) NULL,

    [AddressLine] [varchar](200) NULL,

    [PostTown] [varchar](200) NULL,

    [Country] [varchar](200) NULL,

    [PostCode] [varchar](200) NULL,

    [CompanyStatus] [varchar](200) NULL,

    ) ON [PRIMARY]

    CREATE TABLE [dbo].[Staging](

    [CompanyName] [varchar](200) NULL,

    [CompanyNumber] [varchar](200) NULL,

    [AddressLine] [varchar](200) NULL,

    [PostTown] [varchar](200) NULL,

    [Country] [varchar](200) NULL,

    [PostCode] [varchar](200) NULL,

    [CompanyStatus] [varchar](200) NULL,

    [CheckSumVal] [int] NULL

    ) ON [PRIMARY]

    CREATE TABLE [dbo].[Master](

    [CompanyName] [varchar](200) NULL,

    [CompanyNumber] [varchar](200) NULL,

    [AddressLine] [varchar](200) NULL,

    [PostTown] [varchar](200) NULL,

    [Country] [varchar](200) NULL,

    [PostCode] [varchar](200) NULL,

    [CompanyStatus] [varchar](200) NULL,

    [CheckSumVal] [int] NULL,

    [CreatedDate] [datetime] NULL,

    [LastUpdated] [datetime] NULL

    ) ON [PRIMARY]

    I know how to do Step 1 & Step2. I do Step 3a using LEFT OUTER JOIN on staging.CompanyNumber = master.CompanyNumber

    where mas.CompanyNumber is NULL. I do not know how to do Step 3b.

    Can anybody help?

    Thx

    Rash

  • Hi - we don't have enough to go on.

    First, you have two tables named "Staging" with two different definitions.

    Second, you have described your left outer join with aliases but we cannot see the underlying SQL to know what tables you are aliasing. Therefore, we may guess but it will be without precision.

    Finally, if you have a table that contains change history, it could be keyed on the company number and the checksum or the company number and the date. Inner join on company numbers equal and checksums not equal to capture all rows that have changes (assuming that the checksum and company number are both trustworthy) and excluding all unchanged rows.

    Thanks

    John.

  • Oops! sorry, I did make the changes to table name & alias

  • rash3554 (3/25/2014)


    Oops! sorry, I did make the changes to table name & alias

    OK That helps. With those changes, then I stand by my previous advice to inner join on the company number equal and checksum not equal and that should produce a delta of changed rows.

    Thanks

    John.

  • Here's what I did

    Month1:

    Insert into Bulkcopy(CompanyName ,CompanyNumber ,

    AddressLine,PostTown,Country,PostCode,CompanyStatus)

    ('A Company','006','2 Main st','White Plains','NY','USA',null,'1')

    Staging Month1:

    Select CompanyName +','+CompanyNumber +','+

    AddressLine+','+PostTown+','+Country+','+PostCode+','+CompanyStatus+','+CheckSumval

    from Staging;

    will yield

    A Company,006,2 Main st,White Plains,NY,USA,null,1,123456

    Master Month1:

    --delta

    insert into Master(CompanyName ,CompanyNumber ,

    AddressLine,PostTown,Country,PostCode,CompanyStatus,CheckSumval,lastupdated)

    select top 10 stg.CompanyName, stg.CompanyNumber, stg.AddressLine, stg.PostTown,

    stg.Country, stg.PostCode, stg.CompanyStatus, stg.CheckSumVal,GETDATE()

    FROM Staging stg (nolock)

    inner join Master mas (nolock)

    ON stg.CompanyNumber = mas.CompanyNumber

    and stg.checksumval!=mas.checksumval

    where CompanyStatus=1

    order by stg.CompanyNumber;

    GIVES NO OUTPUT

    --new records

    insert into Master(CompanyName ,CompanyNumber ,

    AddressLine,PostTown,Country,PostCode,CompanyStatus,CheckSumval,lastupdated)

    select top 10 stg.CompanyName, stg.CompanyNumber, stg.AddressLine, stg.PostTown,

    stg.Country, stg.PostCode, stg.CompanyStatus, stg.CheckSumVal,GETDATE()

    FROM [staging] AS stg WITH (NOLOCK)

    LEFT OUTER JOIN [Master] AS mas WITH (NOLOCK)

    ON stg.CompanyNumber = mas.CompanyNumber

    where mas.CompanyNumber is NULL order by stg.CompanyNumber;

    A Company,006,2 Main st,White Plains,NY,USA,null,1,123456

    Month2

    Insert into Bulkcopy(CompanyName ,CompanyNumber ,

    AddressLine,PostTown,Country,PostCode,CompanyStatus)

    ('A Company','006','2 Main st','White Plains','NY','USA',01849,'1')

    Staging Month2:

    Select CompanyName +','+CompanyNumber +','+

    AddressLine+','+PostTown+','+Country+','+PostCode+','+CompanyStatus+','+CheckSumval

    from Staging;

    will yield

    A Company,006,2 Main st,White Plains,NY,USA,01848,1,222335

    Master Month2:

    --delta 1st run

    A Company,006,2 Main st,White Plains,NY,USA,01848,1,222335 gets inserted

    2nd delta run

    I get Primary key Constraint violation

    How can I resolve this?

    THx

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply