delete large number of rows

  • I have a table of 2 million rows, and I want to delete 2/3 of it. A delete statement took nearly one hour yet not finish the job. I killed it.

    Next I tried to select the 1/3 of the table rows I want keep into another table, it took minutes, drop the old table, took a minute, and rename the new table. the total process took around 10 minutes.

    The database is in Simple mode, thus no log  is written.

    Why delete is so slow ? Is this common ?

  • Every delete must be placed in the log which longer than to insert only half that amount of data in a new table and rename it.

  • Thank you for reply.

    The database is in Simple Recovery mode. Does delete still be logged ? I did not observe large increase in log file.

  • I'll shut up now before I put my foot too far in my mouth. I'll leave this one to the backup experts.

  • Simple mode doesn't mean no log is written. It bacically means that you are truncating the log on checkpoint. All data modifications still go through the log, it is just truncated after every checkpoint (approximately once a minute, depending on server speed, etc.).

    If you wrap up a large amount of changes inside a transaction, you can watch the log grow. This is because a checkpoint only writes COMMITTED transactions into the database. Here's a thumbnail:

    1) Begin Transaction

    2) Data Modification

    3) Commit transaction

    4) Checkpoint (data is copied from the log into the data files (mdb and/or ndb). The data exists in both the log and data file(s) at this point.

    5) Truncate (delete) all committed transactions out of the log (when in simple mode).

  • you could also try something with smaller units of work to allow the transaction log to 'clean' itself ...

     

    set rowcount 10000                          --> this allows only 10,000 rows to be deleted

    go

    declare @no_rows int

    select @no_rows_left=0

    while @no_rows_left <> 1

    begin

        delete from dbo.your_table_name

            where ...

        select @no_rowsleft=@@rowcount

    end

     

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • Thanks guys.

    So it basically the logging process slow down the delete operation.

    Is there a way to turn off transaction log, like one can do in Oracle ?

  • NO. Everything is logged <period> It's just whether or not the data in the log is cleared (dependent on the receovery mode).

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • Truncating the table would bypass the recording of every row deleted in the transaction log (and delete all of the rows from the table as well). It would still log that the Truncate occurred though (and any page allocation changes, etc.).

    There isn't a way to turn off the log altogether. It is an integral part of the database.

  • truncatin logs page allocations/deallocation as opposed to rows for a delete. Still logging, just less.

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • Right. I thought I said the same.

  • Do you have foreign key constraints? If the table being deleted is a "parent" table, then for every row being deleted SQL will check ALL related child tables to make sure that you are not violating relational integrity. If the child tables do not have indexes on those columns, you'll be running table scans (and most child tables are on the "many" side of one-to-many...)

    If you have foreign keys, do you have cascading deletes? [Biased Opinion Alert] If so, turn them off! [End Biased Opion] If so, then you're deleting from more than one table... and if you've got a series of cascading deletes across multiple tables, you're deleting who knows how much data from who knows where. (Thus my biased opinion on cascading deletes.)

    You cannot truncate a table that's on either end of a foreign key constraint.

    A fussy point on deletes and the transaction log: when in simple mode, the transaction log is not automatically truncated. Transactions are always written in their entirety to the log. If the log file runs out of space, it grows (assuming autogrow is on). If the disk runs out of space, the transaction is cancelled and rolled back and you get a nasty message. Once the transaction is complete and the checkpoint reached, space allcoated within the log file is reused (ye olde circular queue pattern). To regain the disk space, you'd have to us one or another tactic to shrink the log file. (In "full" mode, transaction log file space only gets reused once it is marked as having been backed up.)

    ...the point there is, you should have seen the log file grow. If it didn't, something else may be going on here. (Maybe there were locking/blocking issues?)

    To mention, I've used rudy's tactic in the past for really large deletes, in part because it's faster, with care and planning you can track how things are progressing, and it avoids table locks (which is key if you're pruning a "live" database).

       Philip

     

  • Excellent points Philip. I'd forgotten about some of those hidden 'nasties' lurking in DRI ...

    (I've got to review SQL 201 again !!!)

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • i agree truncate... is your friend.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply