Identifying changed rows in a table

  • Hello,

         I have a problem that I have been trying to figure out the best solution for for a while, I hope I can explain it well enough. I need a method for determining if a record has changed. Basically, our source system does not keep anything like a date changed field for a record. So when I do nightly imports into a reporting datbase, I have to replace everything, both unchanged, changed, and new data.

         Is there (efficiently) a way to compare two tables and update the reporting table if ANY value in the load table has changed, without replacing every record in the reporting table? 

  • The most efficient method is to implement a 'last changed' column in the load table.

    However, it's very likely that this also forces the mechanism that modifies the loadtable to be changed to support this.

    Another option for comparing entire rows may be to use the CHECKSUM() function.

    Look up CHECKSUM in BOL for further info.

    =;o)

    /Kenneth

  • Maybe something like this?

    INSERT INTO tbl_report

    SELECT * FROM tbl_report_refresh r

    WHERE NOT EXISTS

    (SELECT * 

    FROM tbl_report s

    WHERE

    s.col_a =  r.col_a

    AND

    s.col_b = r.col_b

    ETC...

    [font="Courier New"]ZenDada[/font]

  • You don't have to make the application keep track of updates to the table.  Create TRIGGERs to handle the INSERT, UPDATE and DELETE events.  You can maintain a 'shadow' table of changes.  Or perhaps, more efficent would be to log the primary key of the table you are monitoring along with the action and the TimeStamp when it happened. 

    Triggers are very powerful.  Use them sparingly though, they work behind the scenes and can cause confusion due to their stealthy nature.  They can also cause performance problems if not carefully controled and monitored.

  • Thanks for the tips. I can't really change the source tables/database - it's a shrinkwrapped product. Maybe the checksum approach will work out.

  • I can confirm that the CHECKSUM (and or CHECKSUM_BINARY) approach works a treat.

    It possible to create a generic sp that will work for any table with suitable pkeys defined, allowing you to check for updates and apply logic only where necessary.

    Obviously you don't need to use CHECKSUM when looking for inserts or deletes.

    Using triggers to keep a audit of changes should work just fine, but I suspect will lead to a more complex and less robust solution. Depending upon the usage levels of the tables (i.e. lot of updates/only a few updates per sec etc.) however, the overall processing effort may be lower using triggers.

    If you use the CHECKSUM approach (for updates) and IN's and NOT IN's for inserts and deletes, expect the nightly job to take some time if the tables are large.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply