Daily Bulk Load

  • I have a db into which 73 files are loaded on a daily basis. There is 1 table for each of the files. Each file contains a complete set of data and can be used for recovery purposes. No updates are performed on the data once it has been loaded into the database. The data is used for reporting purposes only.

    An SSIS package was set up by a previous developer (who no longer works here) to load the data into the database. Essentially, the process is:

    For each file:

    1. Truncate the associated table.

    2. Load the table with the latest data file.
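
    Roughly, each iteration boils down to something like this in T-SQL (a minimal sketch only; the table name, file path, and format options are placeholders, and the real package may use SSIS data-flow components rather than BULK INSERT):

        -- Repeated once per file/table pair; all names below are placeholders
        TRUNCATE TABLE dbo.DailyExtract01;              -- clear out yesterday's data
        BULK INSERT dbo.DailyExtract01
        FROM 'D:\Feeds\DailyExtract01.txt'              -- hypothetical file location
        WITH (FIELDTERMINATOR = ',',
              ROWTERMINATOR   = '\n',
              TABLOCK);                                  -- TABLOCK allows minimal logging under simple/bulk-logged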

    Currently, my MDF and LDF files are 32 GB and 320 GB respectively. This week, I ran out of space on the hard drive.

    I created a copy of the database yesterday and loaded all the data. The copy's MDF and LDF files are only 6 GB and 3 GB respectively. So I know for a fact that I currently need less than 10 GB to store the complete set of data, not 350 GB.

    I've been reading the posts about backups and recovery modes and I've only confused myself further. Any suggestions on how to manage this database, its associated file sizes, and/or the loading process?

    Thanks

  • What recovery model is the database using? I'd read up on them in Books Online and take a close look at why it might be in the one it's currently using. It seems that if this is a database that doesn't change once it's loaded with data that you wouldn't need point in time recovery, and therefore could live with a simple recovery model.
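
    If you're not sure which model it's currently in, a quick way to check is a query along these lines (the database name is just a placeholder):

        SELECT name, recovery_model_desc
        FROM sys.databases
        WHERE name = 'YourReportingDb';   -- placeholder database name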

    I'd still take a backup of the database once the data is loaded so that if anything happens during the day you can quickly restore the db without having to reload all of the data. I would imagine it would be faster than re-importing all of the data.
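
    Something along these lines, scheduled right after the load finishes, would do it (path and database name are placeholders, not your actual setup):

        -- Full backup taken immediately after the daily load completes
        BACKUP DATABASE YourReportingDb
        TO DISK = 'E:\Backups\YourReportingDb_PostLoad.bak'
        WITH INIT;   -- overwrite the previous post-load backup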

    -Luke.


  • Luke L (6/4/2009)


    What recovery model is the database using?

    Bulk-logged

    I'd read up on them in Books Online and take a close look at why it might be in the one it's currently using. It seems that if this is a database that doesn't change once it's loaded with data that you wouldn't need point in time recovery, and therefore could live with a simple recovery model.

    I'll change the recovery model to simple.
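
    For reference, the change itself is a single statement (database name is a placeholder):

        ALTER DATABASE YourReportingDb SET RECOVERY SIMPLE;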

    I'd still take a backup of the database once the data is loaded so that if anything happens during the day you can quickly restore the db without having to reload all of the data. I would imagine it would be faster than re-importing all of the data.

    Ok.

    But would this help me out with recovering the excessive disk space currently in use by the transaction log?

  • Take a look at the table at the top of this page...

    http://msdn.microsoft.com/en-us/library/ms189275.aspx

    It states:

    Recovery Model: Simple

    Description: No log backups. Automatically reclaims log space to keep space requirements small, essentially eliminating the need to manage the transaction log space.

    But would this help me out with recovering the excessive disk space currently in use by the transaction log?

    Yes, obviously this is all stuff you should test in a test environment first to make sure you understand what is changing, but it should help with your space issues given that you don't need point in time restores.
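
    To actually reclaim the 320 GB already allocated to the log file, a one-time shrink after switching to simple would look roughly like this (the logical log file name below is a placeholder; check sys.database_files for the real one):

        USE YourReportingDb;
        -- One-time shrink of the oversized log; not something to schedule routinely
        DBCC SHRINKFILE (YourReportingDb_log, 2048);   -- target size in MB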

    -Luke.


  • Problem solved.

    Thanks for your help.

  • No Problem, glad you got it sorted.

    -Luke.


    I think you need to change your database recovery model to 'Bulk-logged' before you load the files using SSIS. This uses a minimal amount of transaction log space. Once the files are loaded, change the recovery model back to Full and initiate a full backup of the database for your disaster recovery plan, since your T-log chain is already broken.

    If you change the database recovery model to 'Simple', you will not be able to restore to the point of failure. If your database is "critical", the 'Simple' recovery model is not advisable. It is meant for historical databases, where you don't need the most recent data and the data does not change frequently.

    If your transaction log files are growing rapidly, it means your application needs that much T-log space. Schedule a job to take T-log backups and then shrink the T-log. This will give you the required disk space along with a good recovery option. :-)
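
    A rough sketch of the log-backup-then-shrink job described above (database name, logical log file name, and backup path are all placeholders):

        -- Back up the transaction log, then shrink it (requires full or bulk-logged recovery)
        BACKUP LOG YourReportingDb
        TO DISK = 'E:\Backups\YourReportingDb_Log.trn';
        USE YourReportingDb;
        DBCC SHRINKFILE (YourReportingDb_log, 2048);   -- shrink the log after backing it up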

    Let me know if this helps! :-)

    Thanks & Regards,
    Sudeepta.
    http://twitter.com/skganguly

    Perhaps you missed the part where the OP mentioned:

    There is 1 table for each of the files. Each file contains a complete set of data and can be used for recovery purposes. No updates are performed on the data once it has been loaded into the database. The data is used for reporting purposes only.

    The emphasis is mine; the key point is that no updates are performed once the data has been loaded.

    Simple recovery is a viable solution here as once the files are loaded no more changes occur to the database. If a full backup is taken directly after the data load, there will be no changes made to the data prior to the next day's data load. They do not need point in time recovery because no changes are made to the database.

    -Luke.


  • ... once the files are loaded no more changes occur to the database.

    ... there will be no changes made to the data prior to the next day's data load.

    This is correct.
