Refreshing the data warehouse without hurting anybody

  • Not sure that I agree 100% with that statement Ron. If you incrementally load dimensions and then facts, this scenario (facts pointing to incorrect/non-existent dimension members) will not happen.

    I'm not entirely positive on this, as I'm not sure where the inherent drill through gets it's information. But if it gets it from the underlying tables, as they would in certain configurations, at some point during the load the underlying tables will be out of sync with the previously processed cube. This situation will continue until all the data is in place and the cube is reprocessed. Does that make sense?

  • RonKyle (5/13/2015)


    I'm not entirely positive on this, as I'm not sure where the inherent drill through gets it's information. But if it gets it from the underlying tables, as they would in certain configurations, at some point during the load the underlying tables will be out of sync with the previously processed cube. This situation will continue until all the data is in place and the cube is reprocessed. Does that make sense?

    Makes sense, but you're assuming that all queries against the relational database happens via a drill-through in the cube. Users could make use of other methods to query the underlying database.

  • Users could make use of other methods to query the underlying database

    In other places, perhaps, but I don't allow that.

  • Good luck here. I think you're in a tough spot now, and certainly a "Case" won't necessarily help.

    One thing we did years ago with a DW was have an "as of" table that was joined in with queries. This allowed us to reload partial parts of tables, even rolling back a load to a previous date/load counter and then limiting what all queries could access for data. We could load the new fact data, then dimensions/other tables, and when we were ready, update the as of to include new data in queries. Prevented issues with tools to pull back data that wasn't complete.

    Might see if you can implement something, or get the outsourcers to think along these lines.

  • it shouldn't void any warranty (BWAAAA-HAAAAA-HAAA!!!! Like someone thought of that in the contract!

    Are you psychic? Actually there was one warranty. That was that they'd bill us come rain or hail or driving snow. And in fairness to them they have kept their word at least on that front.

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • certainly a "Case" won't necessarily help

    Agreed. The case/jury analogy was just to promote discussion in a situation where the contract is silent on certain issues (all contracts are silent on at least some things). It then comes down to what a reasonable and impartial person would expect from a vendor who promised to deliver a DW. A very experienced and successful PM once told me that if you need to reach for the contract to find out where you stand, it's a good sign the working relationship is over.

    Oh... and I almost forgot (this one is a brilliant). They have some funky "automated" way of creating their DWs but it means that most dimensions are basically cookie cutter replicas of each other with repetitive column headers like *_group and *_code. They could only (and I'm not joking when I tell you this) accommodate attributes that exist in the source system! So we said "But what if our business wants to aggregate by groupings that don't exist in the source system?" They said "Oh... our process can't really handle that right now. It's a very unusual request, you understand. So you're sure you need this? Oh Ok. We'll work something out (on your time) and get back to you."

    It's so unusual that Kimball devotes an entire chapter to it in The Microsoft Data Warehouse Toolkit. Hahaha!

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • GPO (5/13/2015)


    it shouldn't void any warranty (BWAAAA-HAAAAA-HAAA!!!! Like someone thought of that in the contract!

    Are you psychic? Actually there was one warranty. That was that they'd bill us come rain or hail or driving snow. And in fairness to them they have kept their word at least on that front.

    Heh... nah. Experienced. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • The case/jury analogy was just to promote discussion in a situation where the contract is silent on certain issues (all contracts are silent on at least some things).

    If they wrote the contract then any ambiguity would favor you. If both sides were equally represented there should be a clause stating this so that the "reasonable person" view would apply.

  • Oh... and I almost forgot (this one is a brilliant). They have some funky "automated" way of creating their DWs but it means that most dimensions are basically cookie cutter replicas of each other with repetitive column headers like *_group and *_code. They could only (and I'm not joking when I tell you this) accommodate attributes that exist in the source system! So we said "But what if our business wants to aggregate by groupings that don't exist in the source system?" They said "Oh... our process can't really handle that right now. It's a very unusual request, you understand. So you're sure you need this? Oh Ok. We'll work something out (on your time) and get back to you."

    And this is pretty poor. If you can change them, you should.

  • ...

    Oh... and I almost forgot (this one is a brilliant). They have some funky "automated" way of creating their DWs but it means that most dimensions are basically cookie cutter replicas of each other with repetitive column headers like *_group and *_code. They could only (and I'm not joking when I tell you this) accommodate attributes that exist in the source system! So we said "But what if our business wants to aggregate by groupings that don't exist in the source system?" They said "Oh... our process can't really handle that right now. It's a very unusual request, you understand. So you're sure you need this? Oh Ok. We'll work something out (on your time) and get back to you."

    It's so unusual that Kimball devotes an entire chapter to it in The Microsoft Data Warehouse Toolkit. Hahaha!

    Doesn't seem to me that they have even read anything on Kimball. You really have to ask the tough questions during the interview process and not be afraid of making them look bad potentially. It is business, not personal. If they were controlling the interview (as they should not have been, they will make it seem that they can do anything, using generalised statements).

    On the item of data consistency during data loads, if this persents a issue such as experiencing long loading times during regular business hours then I wonder if you could implement a daily database snapshot to report from (this could be a daily delete previous/create new).

    ----------------------------------------------------

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply