Help! Delete operation is VERY SLOW

  • Using Standard Edition of SQL 2000 on a backup domain controller server. A DTS package runs on my local NT machine to delete old and updated records from a large table (30 GB, over 25 million records), using a link to the changes table (~1.6 million records - more than half are new records). I then add the new and updated records from the changes table. The delete operation is VERY SLOW (it takes a day). I'm the only user right now. Any suggestions?

    The only indexes on either table are on the primary keys, which are in the primary filegroup. I also tried executing the query in Query Analyzer, where it's still slow. Here's my query.

    DELETE a
    FROM Large_Table a, changes b
    WHERE a.mbrshp = b.mbrshp
      AND a.gst_prof = b.gst_prof
      AND (b.recstatus = 'Changed' OR b.recstatus = 'Old')

    The primary key in both tables is mbrshp + gst_prof, and since it's an OLAP setup the fill factor is set to 90%.

    Is it unrealistic to expect it to take no more than a few minutes? Is there a better way? Am I missing something or should the setup be different?

  • I've had luck putting the "keys" into a temp table, then deleting in a separate command...

    Select mbrshp, gst_prof
    Into #Temp
    From changes
    Where recstatus = 'Changed'
    UNION ALL
    Select mbrshp, gst_prof
    From changes
    Where recstatus = 'Old'

    DELETE a
    FROM Large_Table a, #Temp b
    WHERE
    a.mbrshp=b.mbrshp AND a.gst_prof=b.gst_prof

    or even use 2 DELETEs, one for "Changed" and one for "Old". You may also want to try putting a clustered index on #Temp, either after the SELECT ... INTO, or:

    Create the #Temp table first, then build the index, then use INSERT INTO ... SELECT FROM changes to populate it.
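    A rough sketch of that create-then-index-then-populate route (the column types here are assumed - match them to the actual key columns of Large_Table):

    CREATE TABLE #Temp
    (
        mbrshp   varchar(20) NOT NULL,   -- assumed type; use Large_Table.mbrshp's type
        gst_prof varchar(20) NOT NULL    -- assumed type; use Large_Table.gst_prof's type
    )

    -- clustered index on the join keys so the delete can seek instead of scanning
    CREATE CLUSTERED INDEX IX_Temp_Keys ON #Temp (mbrshp, gst_prof)

    INSERT INTO #Temp (mbrshp, gst_prof)
    SELECT mbrshp, gst_prof
    FROM changes
    WHERE recstatus IN ('Changed', 'Old')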



    Once you understand the BITs, all the pieces come together

  • It may be slow because it is writing everything to the log. What recovery model do you have on this database?

    I had luck doing a full backup and then setting the recovery model to Simple.
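    For reference, the switch itself is just two statements (the database name and backup path below are placeholders):

    -- take a full backup first, then drop to the simple recovery model
    BACKUP DATABASE MyOlapDb TO DISK = 'D:\Backup\MyOlapDb.bak'
    ALTER DATABASE MyOlapDb SET RECOVERY SIMPLE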

    Dr. Peter Venkman: Generally you don't see that kind of behavior in a major appliance.

    Patrick

    Speak of the wolf and you will see its tail

  • Thank you folks for your quick feedback.

    I forgot to mention that I already have the simple recovery model set, and that I was going to try disabling constraints (for the defaulted empty strings) before deleting.

    Anyway, Thomas, I tried your suggestion as well and now my situation is under control. By disabling the constraints and creating and indexing the #Temp table with just the new/changed keys, the new delete process from that set took only 6 minutes - a vast improvement.

    Tomorrow I'll check on the status of the insert stage, which involves more than just the keys.

    So thanks again.

  • - The simple recovery model only means log data is kept until a checkpoint completes - the deletes are still written to the log first.

    - Maybe index b.recstatus and force its use with an index hint.

    - To keep the log chunks small you could delete in batches (e.g. 50000 deletes per transaction), as sketched below.
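    A rough sketch of that kind of batching on SQL 2000, reusing the #Temp table from the earlier suggestion (50000 is just the example batch size):

    SET ROWCOUNT 50000            -- cap each DELETE at 50000 rows (SQL 2000 style)

    WHILE 1 = 1
    BEGIN
        DELETE a
        FROM Large_Table a
        INNER JOIN #Temp b
            ON a.mbrshp = b.mbrshp AND a.gst_prof = b.gst_prof

        IF @@ROWCOUNT = 0 BREAK   -- nothing left to delete
    END

    SET ROWCOUNT 0                -- reset the cap

    Each pass commits on its own, so the log stays small under simple recovery.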

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:


    press F1 for solution, press shift+F1 for urgent solution 😀


  • What does the query plan show? Maybe you are getting nested loops without a good index? Have you tried ANSI-style joins with index hints, as alzdba suggested? What is the selectivity of your indexes? If it is low, you might be able to solve it by switching the column order of your indexes. You could verify this by profiling or tracing and looking at the number of reads your deletes generate.
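    For example, the original delete rewritten as an ANSI join with an explicit index hint might look like this (PK_Large_Table is an assumed index name - substitute the real one, e.g. from sp_helpindex):

    DELETE a
    FROM Large_Table a WITH (INDEX(PK_Large_Table))   -- force the chosen index; name is assumed
    INNER JOIN changes b
        ON a.mbrshp = b.mbrshp AND a.gst_prof = b.gst_prof
    WHERE b.recstatus IN ('Changed', 'Old')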

    Is the PK clustered? How wide is the index? I have a table of ~560 million rows, ~71 GB including 2 nonclustered indexes and one clustered. It takes around 1 minute for every batch delete of 1,000,000 records, so it is certainly possible to delete quite quickly.

    Cheers, Hans!

  • My experience has been that 2 DELETE queries will run much faster than a single query with a Boolean "OR". Also, even with the "SIMPLE" recovery model set, the deletes are still logged, if only until a checkpoint is reached. On large tables, you may want to "split" or partition the deletes into smaller chunks using some criterion to control the split, such as an entry date column.
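    A sketch of the two-DELETE version, using the same tables as earlier in the thread:

    DELETE a
    FROM Large_Table a
    INNER JOIN changes b
        ON a.mbrshp = b.mbrshp AND a.gst_prof = b.gst_prof
    WHERE b.recstatus = 'Changed'

    DELETE a
    FROM Large_Table a
    INNER JOIN changes b
        ON a.mbrshp = b.mbrshp AND a.gst_prof = b.gst_prof
    WHERE b.recstatus = 'Old'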

    Although it may not be an option, I've found that adding more memory to a server with mega-row tables helps keep me from banging my head on the memory ceiling when operations like yours are performed. Hans could probably help you in this area.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".


  • If the table is very large, I tend to delete in batches of 100, 1000, whatever works.
