Really basic architecture

  • Sorry if this is very basic, but I'm curious about best practices with Analysis Services (AS).  I have a couple of very general questions.

    1.  Does AS typically run on a separate, stand-alone server from the underlying OLTP database?

    2.  Is the AS data source typically the OLTP database itself, or a copy of the OLTP database refreshed onto the AS server?

    3.  If the data source is a copy of the OLTP database, does the copy process typically do some data cleansing, etc.?

    Francis

  • All 3 questions have 1 answer: it depends.

    1. If your OLTP server is maxed out (CPU, memory, I/O, ...), then yes: buy another SQL Server license and set up another box. Otherwise, look at how big your DW will be, how many users will be using it, and how frequently it will be refreshed, and determine whether you have enough resources left over from the OLTP to host the DW.

    2. I usually reference the OLTP directly, or ETL the data into another database and use that.  For my small OLTP databases with few users, I just use the source OLTP database. For my big datasets (>40 million rows in the fact table), I transform and aggregate the data into a different database and run my DW off of it. If your resources are maxed out on the OLTP box, you could buy another SQL Server license, ETL the data to the new box with AS on it, and have your DW box completely isolated from your OLTP.
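    The "ETL the data into another database" approach above can be sketched in T-SQL. This is only an illustration: the database names (`SalesOLTP`, `SalesDW`) and table/column names are hypothetical, and a real load would normally track a watermark rather than use the clock directly.

    ```sql
    -- Hypothetical schemas: SalesOLTP is the source system, SalesDW the warehouse.
    -- Incremental pull of yesterday's orders into the warehouse fact table.
    INSERT INTO SalesDW.dbo.FactSales (OrderDate, CustomerID, ProductID, Amount)
    SELECT o.OrderDate, o.CustomerID, o.ProductID, o.Amount
    FROM   SalesOLTP.dbo.Orders AS o
    WHERE  o.OrderDate >= DATEADD(day, -1, CAST(GETDATE() AS date))
      AND  o.OrderDate <  CAST(GETDATE() AS date);
    ```

    Running a job like this on a schedule keeps the DW box isolated from the OLTP box: AS only ever queries `SalesDW`, so cube processing never competes with transactional load.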

    3. Yes, you should almost always do some data scrubbing and aggregating of the data that goes into the warehouse. This is always the most difficult part. If the data are really bad, and it is end-user error, sometimes I don't scrub at all and let the users see the crappy data they put into the system, to use as a data-quality crowbar.  NULLs are my biggest area of scrubbing; leaving them in will wreak havoc on dimension tables. Aggregation is also very important. I have a financial transaction data warehouse with over 50 million rows of data; about 15 million are reversals of previous transactions. People only want totals out of the warehouse, so I aggregate the reversals with the original charges and reduce the size of my warehouse dramatically. This also helps query time, as it doesn't have to sum out the positives and negatives every time the measure is used.
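    The two techniques described above (NULL scrubbing and collapsing reversals) might look something like this in T-SQL. All names here (`Staging`, `StgTransactions`, `DW.dbo.FactFinance`, the `-1` "Unknown" member) are hypothetical, assumed for the sketch.

    ```sql
    -- 1) Scrub NULL dimension keys so every fact row joins to a real member.
    --    Assumes DimDepartment contains a row with key -1 labeled 'Unknown'.
    UPDATE Staging.dbo.StgTransactions
    SET    DepartmentID = -1
    WHERE  DepartmentID IS NULL;

    -- 2) Collapse reversals into their original charges: net the amounts per
    --    business key and keep only rows that do not net to zero.
    INSERT INTO DW.dbo.FactFinance (TransactionKey, DepartmentID, NetAmount)
    SELECT TransactionKey, DepartmentID, SUM(Amount)
    FROM   Staging.dbo.StgTransactions
    GROUP BY TransactionKey, DepartmentID
    HAVING SUM(Amount) <> 0;
    ```

    The `HAVING SUM(Amount) <> 0` filter is what shrinks the warehouse: a charge and its full reversal net to zero and never reach the fact table, so the cube never has to sum them out at query time.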

    Hope this helps.

  • Thanks, this helps.  Most of the DW books and articles concentrate on the mechanics of AS and not on the overall approach.

    Francis

  • Highly recommend you go out and purchase Ralph Kimball's The Data Warehouse Toolkit. It's worth its weight in gold (OK, maybe not that much, but very good).

    I personally do not use the OLTP as a data source for AS; I find it better to copy the data into a more cube-friendly architecture such as a data mart.

    HTH

    ------------

    Ray Higdon MCSE, MCDBA, CCNA

  • Hear, hear on the DW Toolkit.

    We also always try to extract the data to another repository (usually a data mart); apart from being more cube-friendly, it creates a change buffer for you from the OLTP data source(s).  Lastly, we are nearly always running into issues with OLTP data sources having (too) many rules applied within the programmatic (end-user) layer, which are often too complex to re-apply using T-SQL; pulling the data out into a 'staging' area lets you perform many manipulations (if required) prior to pushing them into AS.
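    Re-applying an application-layer rule in the staging area, as described above, can be as simple as a cleanup statement run before AS processing. The rule and all names here (`Staging.dbo.StgOrders`, the 24-hour cancellation rule) are invented for illustration.

    ```sql
    -- Hypothetical business rule, originally enforced in the application layer:
    -- orders cancelled within 24 hours of creation never count as revenue.
    -- Re-applied in staging so the cube never sees these rows.
    DELETE s
    FROM   Staging.dbo.StgOrders AS s
    WHERE  s.Status = 'Cancelled'
      AND  DATEDIFF(hour, s.CreatedAt, s.CancelledAt) < 24;
    ```

    Doing this in staging keeps the rule in one place on the DW side, rather than trying to reconstruct it inside cube calculations.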

    Steve.
