Star Schema for Help Desk Analysis

  • I am creating a data mart. We want to use it to analyze help desk tickets. The data will be pulled from our help desk management system. It would be nice if a "second set of eyes" could look at my star schema so far. To start with, does my fact table make sense? Am I headed in the right direction?

    factTickets

    -----------

    Fact_ID

    Fact_Ticket_Number

    minutes_spent

    total_minutes_spent

    created_on

    start_on

    due_on

    Last_Modified_On

    assigned_to_worker_id

    created_by_worker_id

    modified_by_worker_id

    end_user_id

    status_lookup_id

    priority_lookup_id

    urgency_lookup_id

    owner_id

    type_lookup_id

    closecode_id

  • Not knowing your help desk system, it looks okay to me. A couple of comments:

  • I'm not sure I understand the difference between minutes_spent and total_minutes_spent
  • Your FKs to dimensions have _id in their names; I'm assuming created_on and start_on link to your Date dimension
  • I'm not sure I understand the difference between priority and urgency -- but if they are truly two different dimensions in your system, it's fine
  • I usually put _AK (alternate key) on my business key names like Fact_Ticket_Number
  • Good luck on it!

    Rob

  • robert.gerald.taylor (5/25/2012)


    Not knowing your help desk system, it looks okay to me. A couple of comments:

  • I'm not sure I understand the difference between minutes_spent and total_minutes_spent
  • Your FKs to dimensions have _id in their names; I'm assuming created_on and start_on link to your Date dimension
  • I'm not sure I understand the difference between priority and urgency -- but if they are truly two different dimensions in your system, it's fine
  • I usually put _AK (alternate key) on my business key names like Fact_Ticket_Number
  • Good luck on it!

    Rob

    Good idea about the AK. I think I'll add that to my fact table now. Also, I need to add artificial system-generated keys to my tables.

    The help desk system contains one or more steps per help desk ticket. So a ticket might be opened (one row in source system), then a help desk person updates the ticket (second row in source system) and then finally closes the ticket (third row in source system). The "minutes_spent" column is the minutes spent on each ticket update or step. The "total_minutes_spent" column is a running total of time spent on the ticket overall, all of the steps and updates of that ticket. Our help desk system calculates both columns and then stores them in the source database.

    Right now, "created_on" and "start_on" are actual date columns from the source system. I have not created a date dimension yet and I might need help finding an efficient way to build one.

    "Priority" and "Urgency" are two different fields in the front end of our help desk system and two different columns in the source database. I have no idea why.

  • It may be within the requirements for your system. I would, however, consider adding a bucket ID to be able to classify the type of the problem addressed by the ticket. There might be two of these IDs - one at the beginning, and the other at the closing; for example, ticket submitted for a bug (code deficiency) but resolved as a problem with misguiding documentation.

  • Another question: how important is it to use a surrogate or artificial key for the dimension tables? Could I get away with just using the FK from the source system for the dimension tables? If I use a surrogate key, won't I need to use lookups in my SSIS packages?

  • imani_technology (5/25/2012)


    Another question: how important is it to use a surrogate or artificial key for the dimension tables? Could I get away with just using the FK from the source system for the dimension tables? If I use a surrogate key, won't I need to use lookups in my SSIS packages?

    SSIS typically does not deal with dims - packages put data into staging tables and another process transforms them into fact tables and, if and where applicable, updates dim tables.

    In justified cases I would not hesitate to use the original ID for dims; however, in many cases this is impractical, especially if one and the same menas different things to say different divisions of the company. In this case you would have to use composite keys and that would make dims clumsy. Also, there are performance3 considerations: if you use your data mart to build cubes, you get beter performance when your keys are INTs or BIGINTs.

  • Right now, "created_on" and "start_on" are actual date columns from the source system. I have not created a date dimension yet and I might need help finding an efficient way to build one.

    You can find a number or articles and scripts here on SSC about creating a date dimension ( http://qa.sqlservercentral.com/scripts/Data+Warehousing/62844/ ). That's what I've done before; taken someone else's Date dimension build script and tweaked it for my needs.

    HTH,

    Rob

  • Thank you, I will definitely check this out.

  • Another question: how important is it to use a surrogate or artificial key for the dimension tables? Could I get away with just using the FK from the source system for the dimension tables? If I use a surrogate key, won't I need to use lookups in my SSIS packages?

    I would disagree with the answer above and say it's one of the most important things to do. You should ALWAYS use surrogate keys in dimensions. Even if you could make it work this way now, you may change the type of dimension in the future to a type 2, in which case it won't work without this key. So what if you have to use lookups? It's not a big deal.

  • I would disagree with the answer above and say it's one of the most important things to do. You should ALWAYS use surrogate keys in dimensions. Even if you could make it work this way now, you may change the type of dimension in the future to a type 2, in which case it won't work without this key. So what if you have to use lookups? It's not a big deal.

    I agree 100% with SSChasing Mays. Use surrogate keys, always. every time. no exceptions.

  • Viewing 10 posts - 1 through 9 (of 9 total)

    You must be logged in to reply to this topic. Login to reply