There's GOTTA Be a Better Way!

  • OK, I have a table with 18 million rows; it's a log file that our products dump into when they connect to our server

    (we make an internet appliance)

     

    I recently discovered that the archival job that rolls these log files out on a daily basis hadn't run properly in quite some time. As a result, there are 8,000,000 extra rows of data.

     

    I can't just do a "delete from history_log where event_date < '1/8/05'" because it locks that table -- I've got thousands and thousands of units trying to dump into that table at any given time, and if that table's locked, it times out the incoming ftp servers, etc, etc, etc.

     

    So, the solution I'm using?

    set rowcount 100

    delete from history_log where event_date < '1/8/05'

    go

    delete from history_log where event_date < '1/8/05'

    go

    delete from history_log where event_date < '1/8/05'

    go

    (repeat 80,000 times).

    This works. A 100-row delete takes at most 2-3 seconds, which doesn't affect latency enough to upset anything else down the chain. It just takes for friggin ever (the job started on Sunday; it's down to a million rows left...)

     

    But surely there's a better way to delete HALF a table, quickly? (I can't just drop the table and recreate it with a BCP insert, because customer service relies on the data for tech support calls.)

     

    Ugh.

     

    Anyway, anyone else have this issue?



    David W. Clary
    MCSD
    Data Miner 49er
    Sr. Database Administrator, Ceiva Logic

  • What about creating a new table and inserting the relevant records into it, stopping at (for argument's sake) 8am today?

    Rename the current table to old and the new table to the current.

    Insert into the current table all records from the old table that are newer than 8am.
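    Those three steps could be sketched roughly like this. This is a sketch only: the `history_log_new`/`history_log_old` names and the '20050110 08:00' cutoff are placeholders, and SELECT ... INTO does not carry over indexes, constraints, or permissions, so those would need to be recreated on the new table before the swap.

    ```sql
    -- 1. Copy the rows worth keeping, up to a fixed cutoff
    --    (cutoff value is illustrative).
    select *
    into history_log_new
    from history_log
    where event_date >= '1/8/05'
      and event_date < '20050110 08:00'
    go

    -- 2. Swap the tables; sp_rename is a near-instant metadata change.
    exec sp_rename 'history_log', 'history_log_old'
    exec sp_rename 'history_log_new', 'history_log'
    go

    -- 3. Catch up: rows that arrived after the cutoff landed in the old table.
    insert into history_log
    select * from history_log_old
    where event_date >= '20050110 08:00'
    go
    ```

    The appeal of this order is that the big copy happens against a table nobody is locking for inserts yet, and the only contested window is the short rename plus the small catch-up insert.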



    Shameless self promotion - read my blog http://sirsql.net

  • How about a

    select * 
    into history_log_temp
    from history_log
    where event_date > '1/8/05'
    go
    TRUNCATE TABLE history_log
    GO
    INSERT INTO history_log
    SELECT * 
    FROM history_log_temp
    GO
    DROP TABLE history_log_temp
    GO

    Just my $0.02.



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • set rowcount 100

    declare @ctr as int

    declare @TOTctr as int

    select @ctr = 1, @TOTctr = 0

    while @ctr > 0

    begin

        delete from history_log where event_date < '1/8/05'

        set @ctr = @@rowcount

        set @TOTctr =  @TOTctr + @ctr

    end

    print 'Rows Deleted :[' + cast(@TOTctr as varchar(10)) + ']'

    go

     

    Johan

    Learn to play, play to learn !

    Don't drive faster than your guardian angel can fly ...
    but keeping both feet on the ground won't get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I would add a waitfor, to make sure the server is not occupied all the time. I use this technique a lot. It also helps to avoid a big transaction log (I have a dump of the log every 30 minutes). You can also add a dump tran if needed.

    set rowcount 100

    declare @ctr as int

    declare @TOTctr as int

    select @ctr = 1, @TOTctr = 0

    while @ctr > 0

    begin

        delete from history_log where event_date < '1/8/05'

        set @ctr = @@rowcount

        set @TOTctr =  @TOTctr + @ctr 

        waitfor delay '00:01:00'

    end

    print 'Rows Deleted :[' + cast(@TOTctr as varchar(10)) + ']'

    go

    "... (repeat 80,000 times) ..."

    With waitfor delay '00:01:00'? Adjust the set rowcount ***

    select dateadd(mi, 80000, getdate()) - getdate()

    results: 1900-02-25 13:20:00.000 -- i.e. roughly 55½ days of waitfor time alone

    Johan


  • OK, it was just an example. You can play with the rowcount and the waitfor values.

    The "go" does the same thing for me: it executes the batch ahead of it, releases resources, and then runs again. As mentioned, I did not see any significant lag as a result of this method.

     

    Hopefully, this is a once-in-a-jobsite event. My goal is to make sure these kinds of data-bloat events don't occur again.

     

    Great ideas, though, especially about recreating the table, inserting the kept information, and then renaming it.

    In your experience, does this cause massive locks when the primary-key/foreign-key constraints are re-established on the interconnected tables?



    David W. Clary
    MCSD
    Data Miner 49er
    Sr. Database Administrator, Ceiva Logic
