Parallel insert in a table

  • Might help if you were to explain what you're doing. using service broker to send dynamic SQL is a truly strange way to use it.

    I can't tell if you're trying to process multiple messages as the same time to insert into the same table, but that's usually not done by sending SQL statements (you'd usually be sending a message payload, and the message content would be getting stored into the table, NOT sending SQL statements and having them execute).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I wanted to do the same Matt and as you clarified, I get it that it's not possible. Rather I am taking a new approach now. The whole purpose was to insert records parallel or fastest way possible. So here is my plan:

    Create a partition function with date as partition

    Create a partition scheme with all on primary

    Create the target table as partitioned over the partitioned scheme

    Now run the processing procedures in parallel to populate data into temp tables for different years

    And at end switch table to target partitioned table.

    Do you see any loophole in this ?

  • sqlenthu 89358 (8/26/2016)


    I wanted to do the same Matt and as you clarified, I get it that it's not possible. Rather I am taking a new approach now. The whole purpose was to insert records parallel or fastest way possible. So here is my plan:

    Create a partition function with date as partition

    Create a partition scheme with all on primary

    Create the target table as partitioned over the partitioned scheme

    Now run the processing procedures in parallel to populate data into temp tables for different years

    And at end switch table to target partitioned table.

    Do you see any loophole in this ?

    No loopholes but I still see problems. You're still loading tables and there's a really good chance that they are still being written to the same drive (or striped drives) as the others. I think that adding partitioning is OK if you're going to keep the year-data and you could benefit from greatly reduced index maintenance and maybe even greatly reduced backup requirements if you set the older static years to Read_Only (assuming 1 year per filegroup which also means 1 file per filegroup, in this case), but it's probably not going to be a panacea of performance for doing the initial load because [font="Arial Black"]you just can't change physics[/font]. If you can't guarantee the use of more than 1 read/write head (or 1 set of striped heads), that's going to be the bottleneck no matter how many parallel loads you try and, because of the additional head thrashing it will cause, it's actually likely to be slower than a straight forward serial load.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Jeff, thanks for the explanation. In my table there will be truncate load operation on monthly basis plus only few read operations. I have currently put all the partitions in primary filegroup only. Currently my practice of processing data in parallel for different years in dynamic tables and at the end switching partition to main table is working good. Please share your views on this implementation.

  • sqlenthu 89358 (8/29/2016)


    Hi Jeff, thanks for the explanation. In my table there will be truncate load operation on monthly basis plus only few read operations. I have currently put all the partitions in primary filegroup only. Currently my practice of processing data in parallel for different years in dynamic tables and at the end switching partition to main table is working good. Please share your views on this implementation.

    Thanks for the feedback. The question now is what do you mean by "is working good"? Yep, I know that means that's it's working but what does it do to the log file, how long does it take compared to a minimally logged INSERT...SELECT (with indexes in place, BTW), and when folks do read against it, how long does it take compared to a properly indexed monolithic table?

    Also, if you're going to load separate years to tables and SWITCH them in, have you looked at the relative simplicity of just using a Partitioned View?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Wouldn't just having multiple staging tables or databases across multiple physicals be good enough to parallel load the data and process it before merging into one primary table? You're still going from say 4 lanes to 1 lane, but if the data is processed, refined or summarized in the 4 lanes before merging to 1 lane, there is still augmentation of the workload across those 4 tables that can save you time before hitting the 1 physical table regardless of table partitioning?

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply