Basic question: Dimension tables - how do you handle changes (i.e. store's manager or floor plan type)?

  • I'm reading Kimball's: The Data Warehouse Toolkit The Complete Guide to Dimensional Modeling (2nd ed).

    So I'm only a little ways in and we are talking about dimensions for a store.

    Store Dimension:

    Store Key (PK)

    Store Name

    ....

    Store Region

    Floor Plan Type

    Store Manager

    .....

    Well, that's going to change over time, store key, name, not too likely.

    Store region, maybe in 10 years they will re-draw... could happen.

    If that DID happen I'd just expect a new row in the store dimension perhaps...

    But what about store manager? That might change every 18 months.

    How do you handle that sort of thing?

    I'm sure the book will get to it, heck maybe 10 pages from now it will!

    But how does a person deal with that sort of infrequently changing dimensional attribute?

    Do you make a NEW row and everything is the same except the Manager?

    How does your database know that the new row is still for the same store, it just has a different manager now?

    What if I wanted to compare that store under Manager Bob and the new Manager Steve?

    (Assuming they both ran it for 1 year exactly and I wanted to compare year to year or something)

  • Maxer (9/6/2011)


    I'm reading Kimball's: The Data Warehouse Toolkit The Complete Guide to Dimensional Modeling (2nd ed).

    So I'm only a little ways in and we are talking about dimensions for a store.

    Store Dimension:

    Store Key (PK)

    Store Name

    ....

    Store Region

    Floor Plan Type

    Store Manager

    .....

    Well, that's going to change over time, store key, name, not too likely.

    Store region, maybe in 10 years they will re-draw... could happen.

    If that DID happen I'd just expect a new row in the store dimension perhaps...

    But what about store manager? That might change every 18 months.

    How do you handle that sort of thing?

    I'm sure the book will get to it, heck maybe 10 pages from now it will!

    But how does a person deal with that sort of infrequently changing dimensional attribute?

    Do you make a NEW row and everything is the same except the Manager?

    How does your database know that the new row is still for the same store, it just has a different manager now?

    What if I wanted to compare that store under Manager Bob and the new Manager Steve?

    (Assuming they both ran it for 1 year exactly and I wanted to compare year to year or something)

    Do you make a NEW row and everything is the same except the Manager?

    Yes, exactly -- this is the definition of the slowly changing dimension. If you want to know the history, you add a new row with the changes (Type 2 change). If you don't care, you just overwrite the attribute with your new info (Type 1 change)

    How does your database know that the new row is still for the same store, it just has a different manager now?

    All of the attributes that relate to the store (other than manager) are still the same Store Name, Store Region, Store Whatever, ...

    You also might want a separate attribute that is current (today's) manager that is separate from manager at time of a sale (or whatever your fact time frame is).

    I realize that those are short answers; it sounds like you are on the right track. The Kimball book is THE starting point.

    HTH,

    Rob

  • rgtft (9/6/2011)


    Maxer (9/6/2011)


    How does your database know that the new row is still for the same store, it just has a different manager now?

    To properly handle SCD of the Type 2 you have to add at least two columns to the target DIMension table, such columns are: DateFrom and DateTo; these two dates on each row tell you the period of time the particular row was "active". Some DBA also add an Active/Inactive flag to each row but in general I prefer to use the DateFrom/DateTo strategy.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • But what about store manager? That might change every 18 months.

    How do you handle that sort of thing?

    18 months is still considered "slow" and can be handled with your standard SCD methods.

    But you should probably read up on "rapidly changing dimensions" which need to be handled differently to avoid performance problems.

    Data Enthusiast | @SQLbyoBI | www.opifexsolutions.com

  • Whether the new manager is tracked with a new row or simply overwritten is not a technical decision, but a business one. There may be lots of reasons to want to track manager performance. A simple overwrite makes it impossible to do this effectively. There would be no way, for example, to know if the store or whatever is being managed is doing better or worse than under the previous manager. Whether it changes every 18 months or so is not the issue. The question is the business value of the information. If providing this adds value to decision makers, than do it that way. You will add value to your product.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply