Sparse files in SQL 2005

  • Perhaps somebody else out there can help out with this:

    I currently have a discussion going on with colleagues about use of sparse files in SQL 2005. It is believed by some that SQL 2005 datafiles can become sparse files, after running certain DBCC commands on them.

    I know SQL 2005 snapshots are sparse files, and have also read the following from BOL:

    The following DBCC commands operate on an internal read-only database snapshot that SQL Server Database Engine creates. This prevents blocking and concurrency problems when these commands are executed. For more information, see Database Snapshots.

    DBCC CHECKALLOC

    DBCC CHECKDB

    DBCC CHECKCATALOG

    DBCC CHECKFILEGROUP

    DBCC CHECKTABLE

    When you execute one of these DBCC commands, the Database Engine creates a database snapshot and brings it to a transactionally consistent state. The DBCC command then runs the checks against this snapshot. After the DBCC command is completed, this snapshot is dropped.

    Sometimes an internal database snapshot is not required or cannot be created. When this occurs, the DBCC command executes against the actual database. If the database is online, the DBCC command uses table-locking to ensure the consistency of the objects that it is checking. This behavior is the same as if the WITH TABLOCK option were specified.

    An internal database snapshot is not created when a DBCC command is executed:

    Against master, and the instance of SQL Server is running in single-user mode.

    Against a database other than master, but the database has been put in single-user mode by using the ALTER DATABASE statement.

    Against a read-only database.

    Against a database that has been set in emergency mode by using the ALTER DATABASE statement.

    Against tempdb. In this case, a database snapshot cannot be created because of internal restrictions.

    Using the WITH TABLOCK option. In this case, DBCC honors the request by not creating a database snapshot.

    The DBCC commands use table locks instead of the internal database snapshots when the command is executed against the following:

    A read-only filegroup

    An FAT file system

    A volume that does not support 'named streams'

    A volume that does not support 'alternate streams'

    Does anybody know if it is possible to make mdf's and ldf's sparse?

  • Please, explain what you actually need - in business terms - and what you expect to achieve.

    Generally, in every day use you don't really need any in-depth knowledge of how SQL Server stores data physically, and the vendor can change the physical implementation of an RDBMS at any point (e.g. in a future release), so you shouldn't rely on a specific implementation.

    You're referring to database snapshots, have you read the article on sparse files (in respect to the way database snapshots are implemented) in Books Online?

    http://msdn.microsoft.com/en-us/library/ms187054.aspx

    Perhaps you're wondering how database files are initialized when the database is created or when files and/or filegroups are added to an existing database?

    http://msdn.microsoft.com/en-us/library/ms175935.aspx

    ML

    ---
    Matija Lah, SQL Server MVP
    http://milambda.blogspot.com

  • Actually, I'm trying to find out why, on rare occasions, mdf and ldf files are created as sparse. If I query using:

    fsutil sparse queryflag

    most of the time I get the response "This file is NOT set as sparse", but very occasionally, the db files are sparse. The problem is, this is causing an issue with my backup software, which can't cope with sparse files.

    Every time I create a DB, it's not sparse, but on occasions it ends up as sparse through some process or other, but I have no idea what is causing this.

    Regards,

  • Are you perhaps trying to backup database snapshots files? That's not supported. Does your backup SW support SQL Server 2005? IMHO it should be aware of this limitation.

    ML

    ---
    Matija Lah, SQL Server MVP
    http://milambda.blogspot.com

  • SQL data and log files are not sparse. Snapshot data files are. Make sure you give your snapshot files a distinctive name (not .mdf) so you can easily distinguish them from normal data files.

    What backup software are you using?

    If you find a file that it's complaining about, you can use the sys.master_files view to get info from SQL on what DB it belongs to. There's an is_sparse column in there.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I have definitely seen mdf's as sparse files, determined as such by using fsutil.

    One of the earlier posts led me to http://www.sqlskills.com/blogs/kimberly/2007/03/04/InstantInitializationWhatWhyAndHow.aspx

    which mentions this feature. The comment "file allocation requests can occur instantly - no matter what the file size" makes me wonder whether when Instant Initialization is used it may use sparse files to do this?

  • Steven Cuthbertson (5/22/2008) makes me wonder whether when Instant Initialization is used it may use sparse files to do this?

    I went digging through BOL yesterday after wondering the same thing. I couldn't find a relation between sparse files and instant file initialization. However, a MSDN search turns up the following confirming your argument against your collegues:

    "NTFS sparse file technology is used for database snapshots and online DBCC CHECK* operations. This section provides more detailed information about this technology in SQL Server."

    Page 50 SQLIOBasicsCh2.doc

    I found similar supporting info regarding ONLINE DBCC (using snapshots) on page 8-9 PhysDBStor.doc

    So it appears that ONLINE DBCC oriented operations will create sparse file snapshots to perform thier operations against.

    Is the OP seeing these errors during DBCC CHECK operations?

    Your friendly High-Tech Janitor... 🙂

  • All instant file initialisation means is that the OS doesn't zero out the file during allocation. It means that the file can be allocated with just some changes to the file allocation tables, with very little writing needing to be done to the disk.

    Normally a file is zeroed out during allocation to avoid the possibility of seeing old data and deleted files.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Understood, but I can see how one might wonder if instant file initialization makes use of sparse files...

    Your friendly High-Tech Janitor... 🙂

  • Gordon (5/22/2008)


    Understood, but I can see how one might wonder if instant file initialization makes use of sparse files...

    Yup. Right now I'm wondering what could be going on.

    Steven: Have you considered mailing or sending a PM to Paul Randal? He's a member of this site and used to be on the Storage Engine Team. He wrote parts of DBCC CheckDB. If anyone can tell you for sure about this, he's a good bet.

    Blog here - http://www.sqlskills.com/blogs/paul

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm willing to bet (at this point after researching enough) that we're just looking at it too deeply...as my wife is fond of telling me "Stop thinking". 😉

    IMO, there's enough supporting documentation that the OP has the "upper hand" in his argument with his collegues in that certain DBCC operations make use of sparse files and that IFI doesn't use them (sparse files)...

    However, if Paul does chime in, I'd be interested to hear what he's got to say.

    Your friendly High-Tech Janitor... 🙂

  • Either way, I think the OP just needs to exclude database snapshot files and any files created by DBCC operations from being backed up. 🙂

    ML

    ---
    Matija Lah, SQL Server MVP
    http://milambda.blogspot.com

  • Gordon (5/22/2008)


    "NTFS sparse file technology is used for database snapshots and online DBCC CHECK* operations.

    I've seen this in BOL too, but at no point does it suggest that it maked the whole mdf sparse. I've also tried it on a test database (with and without transactions) and at no point does it sparse the files.

    My test DB is only 2GB though and I can't reproduce this on a bigger and/or busier database because I don't have one.

  • More than likely, what's going on is an ONLINE DBCC operation creates a snapshot sparse file for it's operations - this file now becomes part of the database filestructure temporarily and your backup routines happen to catch this sparse file. I'd suggest that you try to change your maintenance schedules so that they don't conflict with each other.

    What backup software are you using, and is it backing up the database live?

    Your friendly High-Tech Janitor... 🙂

  • No, this is definitely not happening, because I have to stop mssql to run fsutil, so nothing is accessing the file, but it is still set to sparse.

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply