Data warehouse Test Environment

  • Hi,

    I work for a company that provide services and data solutions, data warehouse/ data mining. Servers held on internally and data warehouse developed internaly for the client, and all installed internally. Some databases hold up to 350gb of data. Each client production environment differ from the other one. One of the production environments consist of: one machine with DATABASE 250gb of data and one machine with services. Machine size can vary from client to client.

    I was tasked to delevelp a test environment. I have a VMWare workstation on my pc. we don't have a test machine.

    Can you please suggest/advise me on how I should go about building the test environment that can take into account all different production environment - and different processes sizes. It would be very expensive to have the same productoin machines in test as some machines have very expensive processors.

    Many Thanks

  • If I understand your requirements correctly...

    You should create just the structure of a customer's db (no data) and create SSIS jobs to populate it by taking a random portion of the production db. (there's a way to select a random sample set per table).

    Or you can just used whatever automation process you have in place and use it update the test db. Have a separate job to remove older or unneccessary data from the test db to keep it within reasonable size.

  • serhan2uk (9/9/2010)


    I work for a company that provide services and data solutions, data warehouse/ data mining. Servers held on internally and data warehouse developed internaly for the client, and all installed internally. Some databases hold up to 350gb of data. Each client production environment differ from the other one. One of the production environments consist of: one machine with DATABASE 250gb of data and one machine with services. Machine size can vary from client to client.

    I was tasked to delevelp a test environment. I have a VMWare workstation on my pc. we don't have a test machine.

    Can you please suggest/advise me on how I should go about building the test environment that can take into account all different production environment - and different processes sizes. It would be very expensive to have the same productoin machines in test as some machines have very expensive processors.

    Why you need a test environment?

    To test, isn't it?

    To test what?

    To test functionality and performance, isn't it?

    A true test environment has to be production size otherwise how can you be sure execution plans from test are going to be the same ones after promoting the code?

    350 Gig is a very small size for a datawarehouse, I will go for production size and refresh via full restore.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • If you haven't already come up with a solution...

    And you just need to examine execution plans of poorly performing queries , see here

    You can just copy over the db schema and statistics, without the copying the actual data.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply