Fast ways to load data from SQL server to SQL server

  • george_at_sql - Wednesday, June 13, 2018 7:44 PM

    I fiddled around with the rows per batch from 10000 to 50000.
    The maximum insert commit size was kept at the default value of 2147483647
    However it didnt help much, in decreasing the loading time

    You need to modify the commit size - by leaving it at the default you are telling the process to load all data and then commit in a single transaction.  If you set that value to 50000 - you are telling the process to commit the data every 50000 rows.

    What the ideal value will be - is determined by the CPU, memory, IO and size of the rows.  This is true of any insert on that system - at a certain size (row size - number of rows) the commit can and will take excessively longer and longer.  This will occur regardless of whether or not you are using SSIS, BCP, BULK INSERT or a linked server.

    Once you reach that 'tipping' point - the only real option is to batch the insert operation.  Whether that is a manual process you create in code or setting appropriate switches in BCP, BULK INSERT or setting commit size in SSIS - it is the process of batching the inserts that will reduce the overall time it takes to complete the process.  This may not improve the performance very much - depending on the type of data being transferred, and if that doesn't improve the performance then you have identified the limitations.

    If the processing is constrained by the network - then that is going to be the limitation and there isn't much you can do to improve that performance.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeff Moden - Thursday, June 14, 2018 11:51 AM

    Correct  but, the one thing that people forget is that if you're using the network, then you're using the network.  I've seen people use BCP to save a file and then the copy the file to the other computer and use BCP to import that.  They might even "get the religion" and do the BCP out and then use that same file to do a BCP in to save on the copy step but... you're still transferring data between machines across the same network that the linked server uses.

    For linked servers, I've found it to be more dependent on the driver being used and how it's being used.  People try to do a whole lot of things like joining to the remote server across the linked server and that's just not going to be as fast as a single table "go get what I need" query across the linked server.

    I believe that performance challenged code is the usual problem for linked servers and not the linked servers themselves (provided that they correctly configured).

    I have to agree - nothing wrong with linked servers in an of themselves, however - I have seen too many times where the queries being executed are joining a local table to the remote table and SQL Server cannot figure it out, so punts and creates a cursor to fetch one row at a time.  Or worse - SQL Server attempts to pull the 100,000,000 million+ rows across the linked server to a local worktable and then performs the join and applies the where clause.

    If your query is referencing all remote tables into a local temp table - you can generally get decent performance (depending on the remote systems indexes/statistics - of course).  The only time this breaks down that I have seen is where you run out of transaction log space and/or tempdb space because the query is selecting too much data and the local system doesn't have the necessary resources to support it.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams 3188 - Thursday, June 14, 2018 3:47 PM

    Jeff Moden - Thursday, June 14, 2018 11:51 AM

    Correct  but, the one thing that people forget is that if you're using the network, then you're using the network.  I've seen people use BCP to save a file and then the copy the file to the other computer and use BCP to import that.  They might even "get the religion" and do the BCP out and then use that same file to do a BCP in to save on the copy step but... you're still transferring data between machines across the same network that the linked server uses.

    For linked servers, I've found it to be more dependent on the driver being used and how it's being used.  People try to do a whole lot of things like joining to the remote server across the linked server and that's just not going to be as fast as a single table "go get what I need" query across the linked server.

    I believe that performance challenged code is the usual problem for linked servers and not the linked servers themselves (provided that they correctly configured).

    I have to agree - nothing wrong with linked servers in an of themselves, however - I have seen too many times where the queries being executed are joining a local table to the remote table and SQL Server cannot figure it out, so punts and creates a cursor to fetch one row at a time.  Or worse - SQL Server attempts to pull the 100,000,000 million+ rows across the linked server to a local worktable and then performs the join and applies the where clause.

    If your query is referencing all remote tables into a local temp table - you can generally get decent performance (depending on the remote systems indexes/statistics - of course).  The only time this breaks down that I have seen is where you run out of transaction log space and/or tempdb space because the query is selecting too much data and the local system doesn't have the necessary resources to support it.

    Totally agreed on all fronts there.  As the old saying goes, "It all in the wrist". 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply