Advice on Reducing Database Size

  • Earlier in this thread, you indicated that you are "hopeful" about backups and that you think your backup strategy is "snapshots of the virtual machines" the databases are hosted on. I know this is not the reason for your post, but I would caution you about this approach and suggest you make sure this actually works.

    I am not an expert on virtualization, but I do know that I insist upon having DATABASE BACKUPS - not backups of data files, not snapshots of virtual machines - database backups. Database backups are designed for transactional consistency, meaning I can recover from them and have a database that is usable. Backups of data files which are in use do not provide this, and I'm not sure about VM snapshots.
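
    As a minimal illustration of what a real database backup looks like (the database name and backup path here are hypothetical):

      -- Take a full, checksummed database backup
      BACKUP DATABASE YourDB
      TO DISK = N'D:\Backups\YourDB_Full.bak'
      WITH CHECKSUM, INIT;

      -- Confirm the backup file is readable and complete
      RESTORE VERIFYONLY
      FROM DISK = N'D:\Backups\YourDB_Full.bak'
      WITH CHECKSUM;

    Even VERIFYONLY is no substitute for periodically restoring the backup somewhere and checking that the result is actually usable.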

  • Two things strike me reading this:

    1. I presume you're not actually out of disk space yet, so what will you gain from actually shrinking the data file? By deleting the data, you'll still have plenty of room for growth within the existing file - the file shrink just seems unnecessary to me, unless you need to add another database to that drive.

    2. You mention that "The nature of our business is a lot our data doesn't need to be kept". Clearly, though, you are keeping it for the majority of the time, which leads to the situation you're in now. If you have the option (and I appreciate that you may not, especially if you use 3rd-party apps), I'd look into ways of archiving data as a regular, routine process. That way, the purges that you do regularly will be much smaller and far less painful.

  • You need to get with your records retention person and set a retention schedule. Then write SPs that can be used to purge that data in settable segments. Schedule it in SQL Agent, followed by a shrink and reindex. 🙂
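
    A rough sketch of such a purge procedure, assuming a hypothetical dbo.AuditLog table with a CreatedDate column and a 90-day retention rule:

      CREATE PROCEDURE dbo.PurgeAuditLog
          @RetentionDays int = 90,
          @BatchSize     int = 5000
      AS
      BEGIN
          SET NOCOUNT ON;
          DECLARE @Cutoff datetime;
          SET @Cutoff = DATEADD(DAY, -@RetentionDays, GETDATE());

          -- Delete in small batches so each transaction (and the log) stays small
          WHILE 1 = 1
          BEGIN
              DELETE TOP (@BatchSize)
              FROM dbo.AuditLog
              WHERE CreatedDate < @Cutoff;

              IF @@ROWCOUNT = 0 BREAK;
          END;
      END;

    Run from a SQL Agent job, a batched delete like this keeps each purge small instead of one giant transaction.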

    You should be golden.

    The first line of code starts with coffee. The last line ends with alcohol.

  • This won't reduce your file sizes now, but it may help avoid problems in the future:

    I would recommend making sure your auto-growth settings are appropriate to your database size and growth needs. 10% growth on a file which starts at 100 MB looks a whole lot different when that file has grown to 100 GB: the growth increment goes from 10 MB to 10 GB.
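
    For reference, switching a file from percentage growth to a fixed increment looks something like this (the database and logical file names are hypothetical):

      -- Grow in fixed 512 MB steps instead of a percentage of current size
      ALTER DATABASE YourDB
      MODIFY FILE (NAME = YourDB_Data, FILEGROWTH = 512MB);

    The logical file name can be found with sp_helpfile or in sys.database_files.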

  • Well said, Masakatsu!!!

    The first line of code starts with coffee. The last line ends with alcohol...

    😉

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience.

  • Philip Yale-193937 (9/19/2011)


    Two things strike me reading this:

    1. I presume you're not actually out of disk space yet, so what will you gain from actually shrinking the data file? By deleting the data, you'll still have plenty of room for growth within the existing file - the file shrink just seems unnecessary to me, unless you need to add another database to that drive.

    2. You mention that "The nature of our business is a lot our data doesn't need to be kept". Clearly, though, you are keeping it for the majority of the time, which leads to the situation you're in now. If you have the option (and I appreciate that you may not, especially if you use 3rd-party apps), I'd look into ways of archiving data as a regular, routine process. That way, the purges that you do regularly will be much smaller and far less painful.

    Hello Philip,

    I think you've hit the nail on the head here. I've gone and had a look at the initial sizes of the files (both log and data) and it seems that they haven't grown much beyond their initial size. Therefore I'm going to adopt a similar policy to what you suggest. It's what I was planning to do anyway.

    It seems that if I delete small amounts at a time and keep an eye on the log file, then I don't need to touch the data files, which shrinking wouldn't alter much anyway. I make that assumption because I think that using DBCC ShrinkFile won't reduce a file below its initial size.
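
    For what it's worth, one quick way to keep that eye on the log between delete batches is the standard log-space report:

      -- Percentage of each database's log currently in use
      DBCC SQLPERF(LOGSPACE);

    Running it between batches shows whether the log is filling faster than it is being cleared.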

    Once again, thanks to all the posts and helpful advice.

  • Daniel Smith-480684 (9/21/2011)


    I've gone and had a look at the initial sizes of the files (both log and data) and it seems that they haven't grown much beyond their initial size. Therefore I'm going to adopt a similar policy to what you suggest. It's what I was planning to do anyway.

    Be careful with this assumption. Initial file sizes can be changed (increased) through a number of methods. I've even caught SQL Server 2005 doing an auto-adjust once or twice (I think it was the log file, though), but I never followed up to see what caused the auto-adjust because I was too busy with other things.

    But the rest of your post follows a good thought process.

    Brandie Tarvin, MCITP Database Administrator
    LiveJournal Blog: http://brandietarvin.livejournal.com/
    On LinkedIn!, Google+, and Twitter.
    Freelance Writer: Shadowrun
    Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Daniel Smith-480684 (9/21/2011)

    I make that assumption because I think that using DBCC ShrinkFile won't reduce a file below its initial size.

    ShrinkFile will reduce a file below its original size if you explicitly state what you want the final size to be:

    e.g. if your file was created with an initial size of 500 MB:

    sp_helpfile (to display the file IDs)

    DBCC SHRINKFILE (<file ID>, 100) will reduce it to 100 MB - the target size is given as a whole number of megabytes

    However, as I said before, unless there's a specific need to physically reduce the size of the file on disk, I wouldn't recommend doing this on a data file.

  • Reevaluate indexes. You may be carrying unnecessary baggage.

    To start with, run a script to find duplicate indexes. Drop them.

    Then look at the usage of the remaining indexes.

    If you have only updates on the indexes and no seeks or lookups, and the index is not the PK, then it's pointless to keep them. Drop these.

    Depending on how many columns the index contains, you may be able to regain many gigs of space.
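
    A rough sketch of that usage check, using the sys.dm_db_index_usage_stats DMV (its counters reset when the instance restarts, so treat the numbers as indicative only):

      -- Indexes in the current database that are written to but never read
      SELECT o.name AS table_name,
             i.name AS index_name,
             s.user_updates,
             s.user_seeks, s.user_scans, s.user_lookups
      FROM sys.dm_db_index_usage_stats AS s
      JOIN sys.indexes AS i
        ON i.object_id = s.object_id AND i.index_id = s.index_id
      JOIN sys.objects AS o
        ON o.object_id = i.object_id
      WHERE s.database_id = DB_ID()
        AND i.is_primary_key = 0
        AND s.user_seeks = 0 AND s.user_scans = 0 AND s.user_lookups = 0
        AND s.user_updates > 0;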

    Catch-all queries done right
    Gail Shaw's Performance Blog

  • That's a great point... I've come across this many times in our environment. Quite often we'll have an index that was created ages ago for a report that's no longer used, or for a report that's "annual"... so we'll sometimes script out the index, drop it to free up the space, then add the DDL to the actual reporting stored proc and create it when it's needed (then drop it again).

    Obviously it takes longer to run the report this way, but sometimes that's the lesser of the two evils!
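
    A sketch of that create-use-drop pattern inside a reporting proc (all object names here are made up):

      CREATE PROCEDURE dbo.AnnualSalesReport
      AS
      BEGIN
          -- Build the supporting index only for the duration of the report
          CREATE NONCLUSTERED INDEX IX_Sales_ReportDate
              ON dbo.Sales (ReportDate) INCLUDE (Amount);

          SELECT ReportDate, SUM(Amount) AS total_amount
          FROM dbo.Sales
          GROUP BY ReportDate;

          -- Give the space back once the report has run
          DROP INDEX IX_Sales_ReportDate ON dbo.Sales;
      END;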

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience.

  • MadTester (9/22/2011)


    Reevaluate indexes. You may be carrying unnecessary baggage.

    To start with, run a script to find duplicate indexes. Drop them.

    Then look at the usage of the remaining indexes.

    If you have only updates on the indexes and no seeks or lookups, and the index is not the PK, then it's pointless to keep them. Drop these.

    Depending on how many columns the index contains, you may be able to regain many gigs of space.

    That's a very interesting point and worth investigating. I was looking at some of the indexes on Friday, one of which has two columns that are bits. Is it worth having these in the index?

  • Brandie Tarvin (9/21/2011)


    Be careful with this assumption. Initial file sizes can be changed (increased) through a number of methods. I've even caught SQL Server 2005 doing an auto-adjust once or twice (I think it was the log file, though), but I never followed up to see what caused the auto-adjust because I was too busy with other things.

    But the rest of your post follows a good thought process.

    Hi Brandie,

    I based my assumption on three months of watching (I'm a cautious fellow), but thanks for pointing out that initial file sizes can be changed. It's something to look out for.

    Dan.

  • That's a very interesting point and worth investigating. I was looking at some of the indexes on Friday, one of which has two columns that are bits. Is it worth having these in the index?

    There's a hint in nearly every textbook: never index the gender column.

    What you need to do is find all the procs using that table and look in the procs to see if there are queries which use all the columns in the index, in the order they exist in the index.

    I think it is something like -- select c.text from syscomments c inner join sysobjects o on o.id = c.id where c.text like '%tablename%' -- to find the objects which use a table (in the legacy views the join column is id; the newer catalog views use object_id).
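
    On SQL Server 2005 and later, the same search can be written against the current catalog views instead of the legacy ones (the table name here is a placeholder):

      -- Find procedures/functions/views whose definition mentions the table
      SELECT OBJECT_NAME(m.object_id) AS object_name,
             m.definition
      FROM sys.sql_modules AS m
      WHERE m.definition LIKE '%tablename%';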

    All depends on cardinality as well. If the optimizer estimates that it has to jump from index to table a number of times above a threshold, it will use a table scan.

    So your query has to be specific enough when you create an index for it - and, of course, the same care applies when choosing which indexes to keep.

    My number one rule for getting rid of dead weight is: if there is an index on RID, and RID is unique, never create a composite index which has RID.

    You'd be surprised what the (2005) Database Engine Tuning Advisor creates.

    So, to finally answer your question: if the index only has a bit field, it will probably never be used, since it would probably return half the table.

    If the composite index is there to satisfy a specific high-cost query, and the cardinality is high, then it will be used.

    What I do in my QA environment is delete all the indexes from a certain table, then use the application and monitor the DMVs, slowly adding the needed indexes one by one.

    And sometimes I notice that the composite indexes were just overkill.
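
    A sketch of the kind of DMV check that supports this workflow - here the missing-index DMVs, which accumulate candidates while the application is exercised (the ordering heuristic is arbitrary):

      -- Index suggestions accumulated since the last instance restart
      SELECT d.statement AS table_name,
             d.equality_columns,
             d.inequality_columns,
             d.included_columns,
             s.user_seeks,
             s.avg_user_impact
      FROM sys.dm_db_missing_index_details AS d
      JOIN sys.dm_db_missing_index_groups AS g
        ON g.index_handle = d.index_handle
      JOIN sys.dm_db_missing_index_group_stats AS s
        ON s.group_handle = g.index_group_handle
      ORDER BY s.user_seeks * s.avg_user_impact DESC;

    These are only suggestions; as noted above, blindly creating every candidate leads straight back to overlapping, overkill composite indexes.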

    Catch-all queries done right
    Gail Shaw's Performance Blog

