datapump is verrrrry slow

  • I'm neither a systems person nor a DBA, so please bear with me if I don't explain this quite right.

    I'm running a datapump from a 3rd party application to upgrade their app from FoxPro to SQL Server 2005. By slow, I mean it's inserting about 70-75 records a minute. I can't do anything to the datapump.

    The SQL Server instance is on a virtual server. The virtual server has 1.6 GB of RAM; the physical server has 2 GB. No other apps are running on the server, and the FoxPro .dbf files are on the server. I found the following message in the Event Viewer: "A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 7213 seconds. Working set (KB): 6864, committed(KB): 26672, memory utilization: 25%"

    The app rep says just to let it run. Taking days to insert a lot of data is normal. His only recommendation was to add more RAM, which I'm told can't be done to the virtual server. It just doesn't seem right to me that trying to insert 153,000 records should need more than 1.6 GB. (In 6 hours the pump has inserted 17,581 records and there's 1 table with over 1.3 million records!)

    This is my 3rd attempt at running the pump, so I'm getting kinda frustrated. Any suggestions about what to check will be greatly appreciated.

    Thanks,

    Linda

  • Linda,

    That message is usually caused by an OS working-set trim, and if your SQL Server edition is Standard Edition you are out of luck - there is (currently) no way to prevent it. In my case it usually happened when large files were copied in and out of the SQL Server box. But that should have nothing to do with "large" inserts.

    I believe that something else is getting in the way; 150,000 records is *NOT* a lot.

    Can you check if anything is BLOCKING the pumping process? (sp_who2)
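
    If you haven't used it before, something along these lines will show any blockers (the SPID in the DBCC call is just a made-up example; use whatever number shows up in the BlkBy column):

    EXEC sp_who2;            -- a number in the BlkBy column is the SPID blocking that session
    DBCC INPUTBUFFER (53);   -- 53 = hypothetical blocking SPID, to see what it is running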


    * Noel

  • I did discover that the test SSIS project I'd meant to turn off was still running, so I stopped that and backed up the db to truncate the logs. That's perked the insert up to about 200 records per minute. 🙂

    I'm not sure what the results of sp_who2 mean. The things using CPU time, besides the data pump, are:

    Status      Command             CPUTime    DiskIO
    BACKGROUND  RESOURCE MONITOR    2314638    0
    BACKGROUND  LAZY WRITER         4029564    0
    SUSPENDED   LOG WRITER          80415      0
    BACKGROUND  LOCK MONITOR        2934       0
    BACKGROUND  TRACE QUEUE TASK    5357       0
    SUSPENDED   CHECKPOINT          236029     25247

    Thanks.

    Linda

  • Linda -

    Check your database options and see if switching to the "bulk-logged" recovery model helps... it depends largely on how/what the "data pump" is, but it may help your throughput.
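
    For example (the database name is just a placeholder - swap in your own):

    ALTER DATABASE AppDB SET RECOVERY BULK_LOGGED;   -- minimal logging for bulk operations
    -- when the load finishes, switch back and take a full backup:
    -- ALTER DATABASE AppDB SET RECOVERY FULL;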

    Joe

    Firstly, you have to be patient when running anything SQL-related on a VM. I'm not sure why they said memory can't be added; VMs don't manage the host server's memory the way SQL instances on a cluster do, for example. If you add up the memory figures allocated to all the VMs installed on a physical machine, the total can be greater than the RAM that actually exists on that machine. I would ask again for more memory.

    Otherwise you can help the ETL process by (a sketch of these steps follows below):

    - setting your SQL db recovery model to bulk-logged, if it's not already in simple;
    - temporarily disabling triggers on tables that get inserts/updates/deletes;
    - temporarily disabling constraints - take care with this, as you would usually have a staging area where you check the data integrity;
    - temporarily disabling non-clustered indexes built on the tables that are subject to the data load;
    - checking the clustered index - ideally it's defined in the same order you are sending data to that table, so inserts append rather than split pages.
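
    A rough sketch of those steps (database, table, and index names are made up - adapt them to your schema):

    ALTER DATABASE AppDB SET RECOVERY BULK_LOGGED;              -- if not already SIMPLE
    ALTER TABLE dbo.BigTable DISABLE TRIGGER ALL;               -- triggers off during the load
    ALTER TABLE dbo.BigTable NOCHECK CONSTRAINT ALL;            -- FK/CHECK constraints off
    ALTER INDEX IX_BigTable_SomeCol ON dbo.BigTable DISABLE;    -- repeat for each non-clustered index
    -- ...run the load...
    ALTER INDEX ALL ON dbo.BigTable REBUILD;                    -- rebuilds and re-enables the indexes
    ALTER TABLE dbo.BigTable WITH CHECK CHECK CONSTRAINT ALL;   -- re-validate the constraints
    ALTER TABLE dbo.BigTable ENABLE TRIGGER ALL;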

    Hope this helps.

    Thanks for the suggestions. I've changed the db to bulk-logged, but I don't know if that'll help in the middle of the process. I'll try removing the primary key constraint on this large table since it's an autonumber anyway, but I hate to mess with other stuff when I don't know how it's supposed to work in a 3rd party application.

    We are going to get some more RAM for the physical machine, so the virtual one can have more.

    Linda

  • I'm by no means making fun of you, Linda... but the polar ice caps are retreating faster than "200 rows per minute", and it's why I don't like things like DTS and SSIS... they're well intended but they're slow... and if not properly exercised, they can be, as you're finding out, horribly slow. Sure, bulk load tasks in either can help, but you won't find anything faster than using "Bulk Insert" and some good ol' fashioned T-SQL to do a merge.

    It DOES take more time to research and write it that way... and you may have to tell FoxPro to export the data in some format easily understood by Bulk Insert... of course, you're spending quite a bit of time right now, aren't you? 😉
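
    Just as a sketch (the file path, table, and delimiters are made up - you'd match them to whatever format FoxPro exports):

    BULK INSERT dbo.BigTable
    FROM 'C:\exports\bigtable.txt'       -- hypothetical text export from FoxPro
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        TABLOCK,                         -- table lock allows minimal logging
        BATCHSIZE       = 50000          -- commit in batches to keep the log manageable
    );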

    Another possibility is to simply create a "Linked Server" between SQL Server and FoxPro... that can be nasty fast, as well.
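
    Something like the following, assuming the Visual FoxPro OLE DB provider (VFPOLEDB) is installed on the SQL box (the server name, path, and columns are examples only):

    EXEC sp_addlinkedserver
         @server     = 'FOXPRO',
         @srvproduct = 'Visual FoxPro',
         @provider   = 'VFPOLEDB',
         @datasrc    = 'C:\foxdata\';    -- folder holding the .dbf files

    INSERT INTO dbo.BigTable (Col1, Col2)
    SELECT Col1, Col2
    FROM OPENQUERY(FOXPRO, 'SELECT Col1, Col2 FROM bigtable');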

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff, I agree there should be a better way, but since the people who own the application like their datapump (I've talked to their tech support), I'm stuck with it. There are some political issues (ugh!) involved, too. At this point I'm just trying to do what I can to get it to finish.

    Linda

  • Linda,

    Make sure your logs are big enough so that you are not getting autogrows... In addition, you should check that "autoshrink" (a database property) is "off".
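
    For example (the database and logical log file names are placeholders):

    ALTER DATABASE AppDB SET AUTO_SHRINK OFF;                             -- autoshrink off
    DBCC SQLPERF (LOGSPACE);                                              -- see how full the log is
    ALTER DATABASE AppDB MODIFY FILE (NAME = AppDB_log, SIZE = 4096MB);   -- pre-grow the log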

    I know it's a pain to work with third parties, but still, 200 rows per minute sounds insanely slow.


    * Noel

  • Linda Robinson (10/9/2007)


    Jeff, I agree there should be a better way, but since the people who own the application like their datapump (I've talked to their tech support), I'm stuck with it. There are some political issues (ugh!) involved, too. At this point I'm just trying to do what I can to get it to finish.

    Linda

    Ah... got it... I misread your previous email and thought you made a change to the datapump process. Sorry, Linda.

    --Jeff Moden



  • Make sure your destination table DOES NOT have any indexes.
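
    You can see what's there with something like this (the table name is a placeholder):

    EXEC sp_helpindex 'dbo.BigTable';   -- lists the indexes on the destination table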


    Jacob Milter
