Help with Data Warehouse

  • Hi guys!

    I work at a company here in Brazil, and we have a Data Warehouse. Well, in the beginning it was easy to keep this database indexed because it was small, but nowadays it is around 600 GB. I'm sure it wasn't well modeled/designed to support this amount of data. It has just the primary file/filegroup. The tables aren't partitioned, and some of them (the historical tables) have more than 600 billion rows. Now when I run the maintenance plan (a reindex, for example), it takes a long time to finish... sometimes I need to stop it. Well, I have some questions about this scenario:

    -Partitioning the huge tables: what do I need to be careful about when doing this?

    -Is it a good practice to separate the indexes into different files (other than the ones the tables are on)?

    -Is there a "special way" to reindex huge table?

    We have Windows Server 2003 as the OS, 32 GB RAM, and SQL Server 2005 Enterprise...

    So, if you could help me with some tips and advice, I would really appreciate it... I'm not a database administrator yet, but I want to become one someday...

    Thanks in advance,

    Rafael Melo/Brazil.

    P.S.: Sorry for my English mistakes.

  • rafael_si (5/6/2010)


    Hi guys!

    I work at a company here in Brazil, and we have a Data Warehouse. Well, in the beginning it was easy to keep this database indexed because it was small, but nowadays it is around 600 GB. I'm sure it wasn't well modeled/designed to support this amount of data. It has just the primary file/filegroup. The tables aren't partitioned, and some of them (the historical tables) have more than 600 billion rows. Now when I run the maintenance plan (a reindex, for example), it takes a long time to finish... sometimes I need to stop it. Well, I have some questions about this scenario:

    -Partitioning the huge tables: what do I need to be careful about when doing this?

    -Is it a good practice to separate the indexes into different files (other than the ones the tables are on)?

    It depends. Sometimes it makes sense. Sometimes performance is better if you keep them in the same filegroup as the table data. An alternative would be to separate tables into different filegroups and perform maintenance on specific filegroups on specific days.
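    If you do decide to move indexes onto their own filegroup, here is a minimal sketch of the mechanics (the database name, filegroup name, file path, table, and index below are all placeholders, not anything from your system):

        -- Add a dedicated filegroup and file for nonclustered indexes (names and path are placeholders)
        ALTER DATABASE MyDWH ADD FILEGROUP INDEX_FG;
        ALTER DATABASE MyDWH ADD FILE
            (NAME = MyDWH_Indexes, FILENAME = 'E:\SQLData\MyDWH_Indexes.ndf', SIZE = 10GB)
            TO FILEGROUP INDEX_FG;

        -- Build (or rebuild) a nonclustered index onto that filegroup
        CREATE NONCLUSTERED INDEX IX_FactSales_DateKey
            ON dbo.FactSales (DateKey)
            ON [INDEX_FG];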

    -Is there a "special way" to reindex huge table?

    Are you reindexing or doing a reorg? Your huge historical tables may not need reindexing at all if they are purely historical data (or at least they may need it a lot less frequently).
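    A cheap way to see whether a given historical table even needs attention is to check fragmentation first. A minimal sketch (the table name is a placeholder):

        -- Fragmentation per index for one table; LIMITED mode keeps the scan cheap
        SELECT  OBJECT_NAME(ips.object_id)  AS table_name,
                i.name                      AS index_name,
                ips.avg_fragmentation_in_percent,
                ips.page_count
        FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.YourHistoricalTable'), NULL, NULL, 'LIMITED') AS ips
        JOIN sys.indexes AS i
            ON i.object_id = ips.object_id AND i.index_id = ips.index_id
        ORDER BY ips.avg_fragmentation_in_percent DESC;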

    Data warehouses are typically flatter in design and may appear less well designed than an optimal OLTP system. Warehouses are also designed to be heavily indexed.

    I would recommend setting up an alternative index maintenance routine that performs either a rebuild or a reorg depending on criteria you set. Check out this script from Michelle Ufford to rebuild / reorg your indexes.
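    The general idea behind such a routine, as a minimal sketch (the table and index names are placeholders, and the 5% / 30% thresholds are just the commonly quoted starting points, not values tuned for your data):

        -- Reorganize lightly fragmented indexes, rebuild heavily fragmented ones
        DECLARE @frag float;

        SELECT @frag = avg_fragmentation_in_percent
        FROM sys.dm_db_index_physical_stats(
                DB_ID(), OBJECT_ID('dbo.YourBigTable'), 1, NULL, 'LIMITED');   -- 1 = the clustered index

        IF @frag >= 30.0
            ALTER INDEX PK_YourBigTable ON dbo.YourBigTable REBUILD;       -- Enterprise Edition also allows WITH (ONLINE = ON)
        ELSE IF @frag >= 5.0
            ALTER INDEX PK_YourBigTable ON dbo.YourBigTable REORGANIZE;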

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • The good news is that a 600 Gig database is not such a large database - there should be no problem maintaining it properly, provided it is done the right way.

    Just out of curiosity... 600 billion rows in a table? That sounds to me like too many rows to fit in a 600 Gig database (it would average about one byte per row).

    Tell us a little about the architecture... Does it follow dimensional modeling? Do you have data marts? How many, and how big is each of them?

    What's your current ETL strategy? Do you perhaps have a daily window, or is it a near-real-time DWH?... How big is the daily load?

    Please also list for us your five largest tables in both number of rows and space utilization - kind of two rankings 😉
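    In case it helps, here is a minimal sketch of one way to pull those two rankings on SQL Server 2005 (run it in the warehouse database; the figures are approximate, based on reserved pages):

        -- Largest user tables by reserved space; swap the ORDER BY for the ranking by rows
        SELECT TOP 5
            t.name                                                             AS table_name,
            SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END)  AS row_count,
            SUM(ps.reserved_page_count) * 8 / 1024.0 / 1024.0                  AS reserved_gb
        FROM sys.dm_db_partition_stats AS ps
        JOIN sys.tables AS t
            ON t.object_id = ps.object_id
        GROUP BY t.name
        ORDER BY reserved_gb DESC;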

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • Sorry, my mistake... The number of rows in that table is 600 million...

    Thanks for your reply...

    Rafael

  • As Paul said, this really isn't too big of a database. Knowing your big tables and where your sticking points are when doing the maintenance would be the key to solving your situation.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • PaulB-TheOneAndOnly (5/6/2010)


    The good news is that a 600 Gig database is not such a large database - there should be no problem maintaining it properly, provided it is done the right way.

    Just out of curiosity... 600 billion rows in a table? That sounds to me like too many rows to fit in a 600 Gig database (it would average about one byte per row).

    Tell us a little about the architecture... Does it follow dimensional modeling? Do you have data marts? How many, and how big is each of them?

    What's your current ETL strategy? Do you perhaps have a daily window, or is it a near-real-time DWH?... How big is the daily load?

    Please also list for us your five largest tables in both number of rows and space utilization - kind of two rankings 😉

    Paul, we follow dimensional modeling. We have data marts, with dimensions (not big) and fact tables (really "big" because they are historical). We start the data load process at midnight and it takes around 3 hours.

    Paul, here are some of my tables and their sizes:

    Table           Number of Rows   Data Size   Index Size   Total
    TBROMANEIO         108,924,254    52.53 GB     8.07 GB     60.60 GB
    TBPEDIDO           115,847,827    22.90 GB     3.56 GB     26.46 GB
    FCONTAPAGAR        122,551,064    12.65 GB     0.13 GB     12.78 GB
    FCONTARECEBER      501,094,705   136.76 GB    26.53 GB    163.29 GB
    TBTITULO           536,834,644   115.43 GB    33.59 GB    149.03 GB

    regards,

    Rafael Melo
