Advice on breaking down a large database

  • Hi,

    I'm looking for some design/architecture advice regarding an application that I am currently working on...

    The database (SQL 2005) only contains 10 tables, but its current size (data plus indexes) is over 100GB, currently all in the primary filegroup. The majority of the data is held in just 3 tables.

    The database is constantly written to via a Windows service that pulls data from an MSMQ queue. It is a 24-hour real-time application, so downtime is not an option, but there are 'quiet' times (between 3am and 6am) when queue traffic is reduced. Data volumes are predicted to increase over the next few months, so I would like to implement a partitioning scheme to break the data down into more manageable chunks, not only to improve query times but also to reduce the size of the backup set.

    Query times are 'acceptable' at the moment, but they are slowly creeping up. Supporting indexes are regularly rebuilt or reorganized depending on their fragmentation level, but with the tables growing every day, I wanted to try to make the dataset more manageable.

    The requirement for the application is to keep 3 months' historical data online at any one time, but all reports have to include the latest data (up to the second). Once tomorrow comes, today's data becomes historical and won't change; new data is effectively appended to each table.

    It has been identified that a month's data should be effectively self-contained, so that a month can be backed up (which only needs to be done once) and dropped out of the database at the appropriate time.

    I have used SQL Server 2005's new partitioning functionality with a sliding-window scenario in previous projects, but unfortunately the version of SQL Server concerned this time is Standard Edition rather than Enterprise, so frustratingly the native SQL 2005 partitioning is not an option (?).

    I was considering (although I am obviously open to any advice!) using filegroups to manage this, with a month's data contained in its own filegroup. This would allow a filegroup backup to be performed to back up a month's data. So there would effectively be 4 filegroups: 3 each containing a previous month, plus a fourth for the current month. For example, if this were implemented now, there would be a filegroup each for Jan, Feb and Mar, and April's data would go into the fourth. When we get to May, January would be dropped off and May's data added, and so on. A rough sketch of what I mean is below.
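
    Something along these lines is what I had in mind for each month (all names, sizes, dates and paths below are placeholders, and the table is a simplified stand-in for the real ones):

    -- Create a dedicated filegroup and file for the month
    ALTER DATABASE MyAppDb ADD FILEGROUP FG_April;

    ALTER DATABASE MyAppDb
    ADD FILE (
        NAME = 'MyAppDb_April',
        FILENAME = 'D:\SQLData\MyAppDb_April.ndf',
        SIZE = 10GB
    ) TO FILEGROUP FG_April;

    -- The month's table is created on that filegroup rather than PRIMARY.
    -- The date column is part of the primary key and has a CHECK constraint
    -- covering the month, which is also what a partitioned view would need.
    CREATE TABLE dbo.QueueData_April (
        QueueDataID bigint        NOT NULL,
        EntryDate   datetime      NOT NULL,
        Payload     varchar(8000) NOT NULL,
        CONSTRAINT PK_QueueData_April PRIMARY KEY (QueueDataID, EntryDate),
        CONSTRAINT CK_QueueData_April_Month
            CHECK (EntryDate >= '20070401' AND EntryDate < '20070501')
    ) ON FG_April;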

    Clearly there is the issue of implementing this retrospectively (and moving data around, depending on the solution), but that can be scheduled in. It is the ongoing job of managing the data that I am more concerned with.

    If anyone has experience of this kind of requirement or can offer any advice or alternative solutions, it would be appreciated.

    Thanks in advance for any help...

  • One alternative to partitioning is the manual SQL 7 way, using SQL views & the UNION ALL statement.

    http://blogs.conchango.com/christianwade/archive/2004/11/08/221.aspx
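
    In rough terms the view looks something like this (table and column names are only placeholders, and each monthly table needs a CHECK constraint on the date column so the optimiser only touches the months a query actually needs):

    CREATE VIEW dbo.QueueData
    AS
    SELECT QueueDataID, EntryDate, Payload FROM dbo.QueueData_January
    UNION ALL
    SELECT QueueDataID, EntryDate, Payload FROM dbo.QueueData_February
    UNION ALL
    SELECT QueueDataID, EntryDate, Payload FROM dbo.QueueData_March
    UNION ALL
    SELECT QueueDataID, EntryDate, Payload FROM dbo.QueueData_April;

    Queries (and, provided the view meets the updatable partitioned view rules, inserts) then go against dbo.QueueData rather than the individual monthly tables.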

    You would create a monthly historical table that records are moved into. Using a WHILE loop to operate against small batches of records makes sense when transferring this data.

    There is a good thread here for dealing with large numbers of records.

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=91179

    -- Work in batches of 1000 rows so each transaction (and its impact on the
    -- log and on locking) stays small. Table and <filter> are placeholders;
    -- <filter> is whatever condition identifies the rows to move/delete.
    SET ROWCOUNT 1000

    WHILE EXISTS ( SELECT * FROM Table WHERE <filter> )
    BEGIN
        DELETE Table WHERE <filter>
    END

    SET ROWCOUNT 0  -- reset so subsequent statements are unaffected

    cheers,

    Andrew

  • Hi Andrew,

    Thanks for the post. I've used the partitioned view approach before but am not sure how this will help me in this situation.

    If I have a monthly historical table, are you suggesting that it is moved to an additional filegroup or that it resides in the primary filegroup?

    Ideally, although I need the data online (3 months' historical data plus the current month), I only need to back up a historical month once as it won't change.

    So ideally when a backup is taken, I am only backing up the current month, as the data for the 3 historical months has already been backed up.

    That is why I was thinking of filegroups, so that I can back up a filegroup containing a month at a time.
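
    i.e. something like this once a month has been closed off (database, filegroup and path names are placeholders, and marking the filegroup read-only first is just an idea on my part since the data won't change):

    -- Optionally mark the completed month's filegroup read-only,
    -- then back up just that filegroup
    ALTER DATABASE MyAppDb MODIFY FILEGROUP FG_January READ_ONLY;

    BACKUP DATABASE MyAppDb
    FILEGROUP = 'FG_January'
    TO DISK = 'E:\Backups\MyAppDb_FG_January.bak';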

    Additionally, I was reluctant to move the 'current' month's data once a new month ticks over, as this would incur a potentially massive overhead.

    Seeing as I have already collected this data, I wanted to try to insert a new month's tables and then add that month's data to the new tables, rather than moving data around.
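
    The month-end rollover I pictured is roughly: create the new month's table on a fresh filegroup (as in the first post), alter the partitioned view to cover the new four-month window, and only drop the oldest month once its filegroup backup has been verified, so no rows are ever moved between tables. Something like this for the final step (names are placeholders again):

    -- Run only after the view no longer references January and its
    -- filegroup backup has been verified. If the filegroup was marked
    -- read-only, it may need to be set back to READ_WRITE first.
    DROP TABLE dbo.QueueData_January;
    ALTER DATABASE MyAppDb REMOVE FILE MyAppDb_January;
    ALTER DATABASE MyAppDb REMOVE FILEGROUP FG_January;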

    Any ideas welcome.

    Cheers

    Neil
