Warehouse Naming conventions

  • I realize this is DTS forum but the warehouse groups are sort of limited so I'll post the question here. Im curious if anyone has ideas or comments on naming conventions for warehouse tables. specifically for tables transformed from operations source systems. for example xx_sometable where xx_ indicate the source system.

    Any thoughts?

    SQL 2012 Standard VPS Windows 2012 Server Standard

  • I don't have much experience with this, but what if some warehouse tables came from more than 1 source system table? A naming convention like that could get messy.

  • Hi

     

    We do the following we pull the source data into a transformation database and use the source names as the table names. When we do the transformations we pull the data into our data warehouse and give the table names a more user friendly name. That way you always have an audit of what you pulled into the warehouse, then your naming standard could be something to do with the subject area of data not the tables.

     

    Mike

  • In my experience, the following works very well:

    • Firstly, you should try and separate your staging area into a separate db. This allows you to have a different backup strategy etc for that staging database.
    • If you name your staging tables <source table name>_<descriptive name> for example SIT_SalesItems it allows you to know where exactly the data comes from as well as what is contained within them. If you have many different sources for your data, you could also prefix it with an abbreviation to indicate the source.
    • All fact tables should be called Fact_<description> ie Fact_Sales and dimensions Dim_<description> ie Dim_Customer.

    Of course the most important thing is to be consistent, irrespective of what you decide on as a naming convention. I have found the above to work very well, but see what will work best in your environment, document it and make sure everyone sticks to that same standard.

    Hope this helps,

    Martin

  • We identify sources of data by using a column called SourceSystemID.  This links to the SourceSystem table which lists all our data sources.

    Our data warehouse tables - well, it's more a data repository, but the principle holds - have the SourceSystemID column on them so we can see which records came from which source.

    Because of this, we don't need a naming convention that supports identifying data sources.

  • You really need to establish a standard and document it.  Some would probably use a different prefix for data staging tables like calling one as ds_transaction for data stage transaction or ds_salesperson.

    When it goes to the presentation area of the data warehouse you can use, as given before, a prefix of fact for fact tables or dim for dimension tables or brg for bridge tables.  You also need to establish a standard for aggregate tables like adding a suffix (example: fact_mnth_sales_agg for montly sales fact aggregates).  You may also want to adopt standards like how to shorten names like amount (make it amt), premium (make it prem), and agent (agt).  It can be difficult to manage very long table or field names that's why you need to adopt this abbrevation standards.

    Discuss it with the data warehouse members, document it, and conform to the standards.  When you document it you also create a data dictionary and use it as a map where to find a particular data element.  Make sure this document is updated whenever there are changes to the data warehouse.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply