Dirty Pages and Buffer Cache

  • I’m having problems understanding the relationship between checkpoints, the buffer cache, dirty pages, and transaction logs. Please tell me if the definitions below are correct. I’m most confused about the definitions of dirty pages and buffer cache. I’ve read a few books that appear to have slightly different definitions.

    Dirty Page

    (Definition #1) Any modified pages, residing in the buffer cache, not yet flushed to disk

    (Definition #2) Any uncommitted data residing in buffer cache

    Which is correct? Isn’t a dirty page data that has been modified and committed? I thought uncommitted data was data that has been modified but whose transaction is not yet complete and which has not been written to disk.

    Buffer cache

    (Definition #1) Consists of Procedure Cache and Data Cache. The Data Cache portion contains data modifications, both clean and dirty, that have not been written to disk

    (Definition #2) Consists of Log and Data cache

    Checkpoint

    Flushes uncommitted data “dirty pages” from the buffer cache and only occurs when the database is in SIMPLE mode.

    Commit Tran

    Writes data changes to the transaction log. (I assume the data also is written to the Buffer Cache)

    When data is inserted, deleted or updated within a BEGIN TRAN – COMMIT TRAN or within a batch terminated by GO, the data is written to the buffer cache. If using FULL recovery mode, at what point is the data written to the Transaction Log? Is it first written to the buffer cache and then to the T-Log?

    -- Deletes 100,000 records
    DELETE A
    FROM Table1 A
    INNER JOIN Table2 B ON A.PolicyNumber = B.PolicyNumber
    GO

    CHECKPOINT
    GO

    Since a GO was issued prior to the Checkpoint, isn’t the data considered clean (“committed”), and therefore the Checkpoint will not reduce the size of the log by the 100,000 records that were deleted?

    Thanks, Dave

  • Dirty Page: Def 1 is correct. They may also include pages changed by uncommitted transactions.

    Buffer Cache: Def 2 is correct. The procs are kept in the procedure cache.

    Checkpoint: Def is correct except that checkpoints happen in all database recovery modes. This is what keeps your database from taking longer on startup (less recovery time).

    Commit tran writes only the log pages for that tran to disk. The data pages are flushed only when a checkpoint is run or when more pages need to be loaded into memory.

    All changes in all recovery modes are written to the log (except minimally logged bulk operations). The only difference with simple mode is that the log is truncated often (only the committed tran log records are truncated). The log is essential for recovery in any mode. Using simple recovery does not mean you expect your transactions to be lost.
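
    If you want to see the dirty pages for yourself, here is a rough illustration (assuming SQL Server 2005 or later, where the sys.dm_os_buffer_descriptors DMV is available); is_modified = 1 marks a page that has been changed in memory but not yet written to disk:

    -- Count clean vs. dirty buffer pool pages per database (SQL Server 2005+).
    SELECT
        DB_NAME(database_id) AS database_name,
        COUNT(*) AS pages_in_buffer_pool,
        SUM(CASE WHEN is_modified = 1 THEN 1 ELSE 0 END) AS dirty_pages
    FROM sys.dm_os_buffer_descriptors
    GROUP BY database_id
    ORDER BY dirty_pages DESC

    Running this before and after a CHECKPOINT should show the dirty page count for the current database drop.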

    hope it helps.

  • Thanks for the help. In my Delete statement above, at what point will the 100,000 records be sent to the T-Log? I assume after the first "GO" the data will be committed to the T-Log. If so, will it also exist in the Buffer Cache and then be flushed from cache following the "Checkpoint"?

  • After the first GO, here is what happens at a high level (I am excluding the compilation steps, locks, etc., since that is out of scope for this discussion):

    1. The data is loaded into memory (the buffer pool) for execution. Some of the data may already be in memory (if somebody else is reading the same table).

    2. As each record is changed in the buffer pool, the change is logged in the t-log. So now you have ‘dirty’ buffer pages.

    3. Since you do not have an explicit transaction, the statement by itself constitutes a transaction. So SQL Server proceeds to commit the transaction. To do so, it requests that all the log pages be written to disk. Then the transaction is complete.

    4. Remember, nothing has happened to your ‘dirty’ buffer pages. They are still in memory.

    5. When you ran the checkpoint, those dirty pages were flushed to disk. Had you not run the checkpoint, they would wait there until the system does a checkpoint (which may be a minute away), OR some other process needs to load pages into memory and there are not enough free buffer pages, so the lazywriter writes those pages to disk.

    6. Had you been running in simple recovery mode, in a few moments those log records on disk would be truncated. Otherwise, they would stay there until you do a log backup.
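
    If you want to watch steps 3 through 6 happen, a rough sketch (best run in a test database) is to check log usage with DBCC SQLPERF(LOGSPACE) around the delete and the checkpoint:

    -- Illustrative only: note "Log Space Used (%)" for your database at each step.
    DBCC SQLPERF(LOGSPACE)
    GO
    -- run the big DELETE here; log usage grows because every deleted row is logged
    GO
    DBCC SQLPERF(LOGSPACE)
    GO
    CHECKPOINT
    GO
    DBCC SQLPERF(LOGSPACE)   -- in SIMPLE recovery the used % typically drops after the checkpoint
    GO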

    Hope it helps!

  • One last follow-up question to 5 & 6. If running in Simple recovery mode, would the checkpoint move the data directly to the database or would it first be moved to the log? If moved to the log first, at what point does the log truncation occur? I'm trying to determine how to handle large deletes so the logs don't fill. We need to run a one-time cleanup task.

    Thanks again

  • The transaction is committed only after the log pages are written to disk, no exceptions. All the checkpoint does is write the corresponding data pages for those committed transactions (i.e., those since the last checkpoint) to disk. So if you have a long-running transaction, your log truncation cannot go past that transaction and the log keeps on growing. This is again to preserve the recoverability of that long-running transaction. So it is better to break your long transactions into smaller chunks, so the log portion for those committed transactions can be truncated.

    Try something like (pseudo code):

    SET ROWCOUNT 1000                 -- limit each DELETE to 1000 rows

    DECLARE @rowcount INT
    SET @rowcount = 1000

    WHILE (@rowcount = 1000)
    BEGIN
        BEGIN TRAN
        DELETE ......                 -- your delete statement here
        SELECT @rowcount = @@ROWCOUNT
        COMMIT TRAN
    END

    SET ROWCOUNT 0                    -- reset the limit when done

    This way, no single transaction affects more than 1000 rows.
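
    A variation on the same idea, assuming SQL Server 2005 or later where DELETE TOP (n) is supported (the table and column names below are placeholders reusing the earlier example):

    -- Batched delete with DELETE TOP (n); each statement is its own autocommit
    -- transaction, so in SIMPLE recovery the log space can be reused between batches.
    DECLARE @batch INT
    SET @batch = 1000

    WHILE (1 = 1)
    BEGIN
        DELETE TOP (@batch) A
        FROM Table1 A
        INNER JOIN Table2 B ON A.PolicyNumber = B.PolicyNumber

        IF @@ROWCOUNT < @batch BREAK   -- last (partial) batch has been deleted
    END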

  • Hi SXG, one small question..

    quote:


    6. Had you been running in simple recovery mode, in a few moments those log records on disk would be truncated. Otherwise, they would stay there until you do a log backup.


    When do the log records on disk get truncated in simple recovery mode?

    Cheers!

    Arvind

  • See "Recovery Interval Option" in BOL and the contained links. The documentation states that a checkpoint does not depend on time but on how many records are in the log. SQL Server estimates how long recovery would take and issues a checkpoint accordingly. Other events can also cause a checkpoint.
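
    If you want to check that setting on your server, here is a small sketch (assuming you have permission to run sp_configure; 'recovery interval (min)' is an advanced option, so it only shows after enabling 'show advanced options'):

    -- View the current recovery interval in minutes; 0 means SQL Server manages it automatically.
    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE
    GO
    EXEC sp_configure 'recovery interval (min)'
    GO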

    If the mode is simple or a log backup is done, then the log records of completed transactions can be removed. However, the log file itself does not shrink; it just has more space available for future transactions before it has to grow. Trying to shrink the log file might not work until the active log records have moved out of the virtual log files at the end of the file. After a log file shrink, SQL Server will pad the last virtual log with dummy records so that future log records wrap back to the start. See “Shrinking the Transaction Log”. I’ve had a tough time getting some log files to shrink.
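
    As a rough sketch (the logical file name 'MyDb_log' is just a placeholder; look up your own log file’s logical name first):

    -- How full is the log right now?
    DBCC SQLPERF(LOGSPACE)
    GO
    -- Find the logical name of the log file for the current database.
    SELECT name, type_desc, size FROM sys.database_files
    GO
    -- Try to shrink the log file to roughly 100 MB (placeholder target).
    DBCC SHRINKFILE ('MyDb_log', 100)
    GO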

    Randy
    Helpdesk: Perhaps I’m not the only one that does not know what you are doing. 😉
