Inserting only changed into datawarehouse

  • Hello,

    I am Inserting data into tables in a large datawarehouse;

    I want to insert only changed data into the warehouse instead of INSERTING the whole load all OVER AGAIN

    I need idea on how to go about these

    Many thanks

  • Number of options depending on what edition of SQL you have depends what you can do.

    CDC (Change data capture)

    HashBytes on a MD5 hash, if source & target hash dont match then insert or update.

    The use of merge commands, based on the unique constraint of the row, if they dont exist then insert if they do check the other columns for any differences and update where there is.

    Custom triggers which log updates inserts deletes etc which are stored in an audit table and can be played on the warehouse - pretty much like CDC but if you have Standard edition.

  • I think Anthony meant "custom triggers" - Winnie the Pooh's mates were not that hot when it came to database theory, customised or not 😀

    There is another possibility. If your source systems maintain DateCreated/DateModified columns on the tables you are interested in, it should be relatively simple to select rows where date created/modified > [max date created/modified from warehouse] and then do a MERGE to get them into your DW.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Good spot Phil. I shall ammend.

  • Thanks Guys, but we dont have change data capture turned on, I was thinking of stored proc that use a Not exists ..

    ANY IDEAS guys

  • That would not capture updates to existing data.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • anthony.green (7/20/2012)


    Good spot Phil. I shall ammend.

    Amend? :hehe:

  • Yep, not exists would capture new rows based on a suitable identifier, but won't get updates to extant rows.

    Unless that's all you need 🙂

  • Gazareth (7/20/2012)


    anthony.green (7/20/2012)


    Good spot Phil. I shall ammend.

    Amend? :hehe:

    i assumed that was deliberate!

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Phil Parkin (7/20/2012)


    Gazareth (7/20/2012)


    anthony.green (7/20/2012)


    Good spot Phil. I shall ammend.

    Amend? :hehe:

    i assumed that was deliberate!

    Me too. Mostly.

    Muphry's Law always good for a laugh, whether deliberate or not 🙂

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply