Optimize insertion of multiple rows

  • I am doing many repetitive inserts into a table using ADO.NET and ExecuteNonQuery(). There are about 4,000,000 inserts, and each record has about 150 columns.

    I am just reading from Excel spreadsheets and inserting into SQL Server. This is a one-time load, and I do not need to worry about multiple users accessing the data.

    It is taking about a day to process.

    Right now I am opening and closing the connection on each insert. I thought that the connection would go back into the pool and the overhead would not be that terrible. Obviously, I have to keep it open for longer, but for how long? Is there a problem with keeping the connection open for hours?

    Should I concatenate inserts in the command text? If so, is there an optimal number of concatenated inserts?

    Should I turn off row and page locking? (there are no other users)

    SET NOCOUNT ON?

    What else can I do to speed the process? (A sketch pulling these ideas together follows this post.)

    TIA

    David
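
    For reference, a minimal sketch of the single-connection pattern being asked about: the connection opened once, SET NOCOUNT ON issued once for the session, a parameterized INSERT prepared once, and commits taken in batches. The table, the columns, and the RowData/ReadSpreadsheets stand-ins are all invented for illustration; the real table has ~150 columns.

        using System;
        using System.Collections.Generic;
        using System.Data;
        using System.Data.SqlClient;

        // Hypothetical stand-in for one parsed spreadsheet row.
        class RowData { public int Value1; public string Value2; }

        class BatchedLoad
        {
            // Stand-in for the real Excel-reading code.
            static IEnumerable<RowData> ReadSpreadsheets()
            {
                yield return new RowData { Value1 = 1, Value2 = "example" };
            }

            static void Run(string connectionString)
            {
                using (SqlConnection conn = new SqlConnection(connectionString))
                {
                    conn.Open();  // open once; reuse for the entire load

                    // SET options issued as their own batch persist for the session.
                    using (SqlCommand nocount = new SqlCommand("SET NOCOUNT ON;", conn))
                        nocount.ExecuteNonQuery();

                    SqlTransaction tx = conn.BeginTransaction();
                    SqlCommand insert = new SqlCommand(
                        "INSERT INTO dbo.LoadTarget (Col1, Col2) VALUES (@p1, @p2);",
                        conn, tx);
                    insert.Parameters.Add("@p1", SqlDbType.Int);
                    insert.Parameters.Add("@p2", SqlDbType.NVarChar, 100);
                    insert.Prepare();  // compile the plan once, reuse it for every row

                    int pending = 0;
                    foreach (RowData row in ReadSpreadsheets())
                    {
                        insert.Parameters["@p1"].Value = row.Value1;
                        insert.Parameters["@p2"].Value = row.Value2;
                        insert.ExecuteNonQuery();

                        // Commit in batches so the transaction log stays manageable.
                        if (++pending == 10000)
                        {
                            tx.Commit();
                            tx = conn.BeginTransaction();
                            insert.Transaction = tx;
                            pending = 0;
                        }
                    }
                    tx.Commit();
                }
            }
        }

    Committing every 10,000 rows is a guess at a reasonable batch size; the right number depends on row width and log configuration.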

  • Have you considered creating a DTS package to get the data in?


    Kindest Regards,

    Michael Page

  • No problem keeping connections open. If they aren't using resources (running queries), there's very little overhead. Just make sure you have error-handling code in case the connection gets dropped.

    I'd actually consider SSIS / DTS instead if this is a regular process. You could even drive the package programmatically if you needed to make minor changes (a rough sketch of that follows below).
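
    A rough sketch of what driving a package from code can look like with the SQL 2005 SSIS runtime, assuming a reference to Microsoft.SqlServer.ManagedDTS; the package path and variable name here are invented.

        using System;
        using Microsoft.SqlServer.Dts.Runtime;

        class PackageRunner
        {
            static void Main()
            {
                Application app = new Application();

                // Load the package from disk; it could also be stored in SQL Server.
                Package pkg = app.LoadPackage(@"C:\ETL\SpreadsheetLoad.dtsx", null);

                // Package variables let you tweak behavior per run without editing the package.
                pkg.Variables["User::RootDirectory"].Value = @"C:\IncomingData";

                DTSExecResult result = pkg.Execute();
                Console.WriteLine(result);   // DTSExecResult.Success or .Failure
            }
        }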

  • Thanks for the responses.

    DTS would not work for this. The data comes in different formats, and there is some data manipulation involved. There is a base class and derived classes for the different formats, directory structures, etc.

    I am not familiar with SSIS, but the application is working and has a simple UI that anyone could use. However, it seems slow...

    Right now I am looking for a way to speed up multiple inserts.

    David

  • I'd seriously consider using SSIS for this, if you're using SQL 2005. It can be set up to read various input file formats, variations in column names, etc. It can have different data transformations based on variables, data in the file being imported, etc. It's a pretty powerful tool for the kind of thing you're doing.

    I can't give any advice on how to make your existing code work better; that's outside my field of expertise.

    I can say I've done things that are similar (based on your description), using DTS and SSIS, and they have worked quite well.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • SSIS is certainly worth investigating, but realistically it will not be for this round. It is more cost-effective to wait a few more hours for the current application to run than to redo the code.

    We also have other factors that may or may not rule out SSIS. Right now the user only needs to navigate to a root directory through the UI and click Go. Internally, a factory object determines the correct class to return (each implementing a common interface), and the processing begins using that class (roughly as sketched below). Can this outward simplicity and inward complexity be achieved with SSIS?
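
    For what it's worth, the arrangement described above amounts to something like the following; every type and method name here is invented for illustration.

        using System.IO;

        // Each supported format gets its own loader behind a common interface.
        public interface ISpreadsheetLoader
        {
            void Load(string rootDirectory);
        }

        public class ExcelLoader : ISpreadsheetLoader
        {
            public void Load(string rootDirectory) { /* read .xls files and insert */ }
        }

        public class CsvLoader : ISpreadsheetLoader
        {
            public void Load(string rootDirectory) { /* read .csv files and insert */ }
        }

        public static class LoaderFactory
        {
            // Inspect the directory the user picked and return the right loader.
            public static ISpreadsheetLoader Create(string rootDirectory)
            {
                if (Directory.GetFiles(rootDirectory, "*.csv").Length > 0)
                    return new CsvLoader();
                return new ExcelLoader();
            }
        }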

  • For a one-time load, I'd export the data to tab-delimited files and BULK INSERT them. Nasty fast... (a sketch follows at the end of this post).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

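    A sketch of that idea, assuming the spreadsheet rows have already been written to a tab-delimited file; the file path, table name, and connection string are invented, and the path must be visible from the SQL Server machine, not the client.

        using System.Data.SqlClient;

        class BulkLoad
        {
            static void Run(string connectionString)
            {
                using (SqlConnection conn = new SqlConnection(connectionString))
                {
                    conn.Open();

                    // TABLOCK takes one table-level lock and enables a minimally
                    // logged bulk load under the right recovery model.
                    SqlCommand bulk = new SqlCommand(
                        @"BULK INSERT dbo.LoadTarget
                          FROM 'C:\Staging\load.txt'
                          WITH (FIELDTERMINATOR = '\t',
                                ROWTERMINATOR   = '\n',
                                TABLOCK);",
                        conn);
                    bulk.CommandTimeout = 0;   // a 4,000,000-row load can take a while
                    bulk.ExecuteNonQuery();
                }
            }
        }

    The TABLOCK hint also speaks to the row/page locking question from the original post: one table lock instead of millions of row locks.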

  • Take a look at the ADO.NET 2.0 SqlBulkCopy class. I've used it previously, and it is much faster than making repeated inserts. Here is a good post with some example code and an explanation of why it is so fast. You will have to write your own code to populate a DataTable, then use SqlBulkCopy to push the data from the DataTable into the database (a short sketch follows below the link).

    http://davidhayden.com/blog/dave/archive/2006/03/08/2877.aspx
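
    A short sketch of that approach; the table and column names are invented, and in practice you would probably fill and flush the DataTable in chunks rather than hold all 4,000,000 rows in memory.

        using System.Data;
        using System.Data.SqlClient;

        class BulkCopyLoad
        {
            static void Run(string connectionString)
            {
                // Shape the DataTable to match the destination table.
                DataTable table = new DataTable();
                table.Columns.Add("Col1", typeof(int));
                table.Columns.Add("Col2", typeof(string));
                // ...one Columns.Add per destination column (~150 in this case)...

                table.Rows.Add(1, "example");   // in reality, one Rows.Add per spreadsheet row

                using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
                {
                    bulk.DestinationTableName = "dbo.LoadTarget";
                    bulk.BatchSize = 10000;     // rows sent to the server per batch
                    bulk.BulkCopyTimeout = 0;   // no timeout on a long load
                    bulk.WriteToServer(table);
                }
            }
        }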

