Conforming dimensions

  • Hi.

    I'm pretty new to dw and currently having some problems with building conformed dimensions. I'm trying to conform the payment code dimension from 2 different source systems.

    Below is an extract of some of the recs stored in the first source (A):

    paycode description
    10 cheque
    11 cash
    12 direct
    22 internet

    From another source system (B), the recs are as such:

    paycode description
    CH cheque
    CA cash
    AX axs

    My conformed dimension on this payment code would consists of a surrogate key, paycode (natural key) and the payment description. However, seeing that the natural key from both sources are stored differently (one as numerals and one as alpha codes), what is the best way of doing this?

    Do i store natural key as numerals and for payment codes which are available in B but not in A, create a new natural key for that code?

    TIA!

     

     

     

  • Hi Stephanie,

    I would suggest having columns "PaycodeA" and "PaycodeB" in your dimension table, and therefore storing both. This is because there isn't really a 'natural' key - particularly if you have to create one in the warehouse (as in your example of creating new natural keys for codes that are in B but not in A). If you throw away the paycodes from source B, then you won't be able to match fact rows from source B to your dimension.

    What do you think?

    Matt.

  • I agree with Matt, especially because with the expansion in internet banking, it seems to me that neither set of codes is truly exhaustive. What if, down the road, the 'cash' option needs to be expanded into 'hard cash', 'internet cash', etc.? Management will change requirements

  • Ok, that means something like:

    PaycodeKey PaycodeA PaycodeB Description</TD

    1 10 CH cheque
    2 11 CA cash
    3 12 direct
    4 22 intranet
    5 AX axs

    Yes, I think that would seem better. At least would be able to match the codes from the conformed dimension back to the source system easily.

    Thks a lot for the suggestion!

     

  • Hi,

    how about converting your PaycodeA to Char and then concatenate it with PaycodeB, it will give you One Natural Key. This way, you save one column.

    How about it?

    Sami

  • Stephenie,

    Actually what you did makes good sence for the data warehouse.  You don't necessarily want to use natural keys in the data warehouse.  Conforming data means coming up with a single way to look at the same data from disparate data sources that use different codes for the same type of information; source system a: 10 cheque -- source system b: CH cheque.  In the data warehouse you don't want to use both codes, as you would need to know both codes for queries to pull out all information for payments by cheque.  What you came up with creates a mapping between the seperate systems to a common set of codes for the data warehouse.

  • Hi Stephanie,

    My view is that you should always think of the end users first. End users want consistency and a single view of the customer. They don't want to query different codes for different products - it requires too much knowledge. Therefore, you should ask them which of the two types they'd rather query - why would they want both codes? I think they should be steered towards the alpha codes as these are easy to check. It's not intuitive that code 10 is cheque for example.

    Then, assuming the users go for the character codes, convert the numeric codes into the same character codes in the ETL and use numeric surrogate keys as per the data warehousing norm. Hope this helps.

  • I agreed with Alan. It is the 'Transformation' piece in the ETL process. It should be in the meta data table. You should ask the users what they wanted to see in the Data warehouse.

    Meta data table should look something liked

    Paycode VARCHAR

    DWPaycode VARHCHAR

    Paycode DWPaycode

    10 Cheque

    CK

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply