How do I get SSIS to parallel processing my package

  • Hi, maybe an easy 1.

    I got 2 database, the first has the a table1 with columns 1 - 20 all defined as varchar. In the second DB the same table1 but columns 1 - 20 now is defined into their correct datatypes i.e. numerics, datetimes, char, varchar etc.

    My SSIS package has an OLE DB source, then a script component which does the conversion and finally the OLE DB dest again. Everythings works perfectly except that I know I can speed it up with the hardware that I have avaiable.

    I have a 8 CPU with 8 GIG ram, but if I look at the SSIS 2005 running, I see that only 1 CPU is used at any time.

    To extract from the first table and insert into the second table takes upto 4 hours on the job seeing that I have about 110 mil records in the table.

    How do I split the SSIS package in order to start using the other CPU's as well.

    (I can use, view, select statements or what ever I want, in order to retrieve the data to insert)

    thanks

    Chris

  • If you really wanted to split the processing into multiple data flows, you could just copy your current data flow and filter the source data so they included mutually-exclusive data sets. If there is no control flow connector between the two data flow components, they will run in parallel.

    You may run into issues with this if the destination components do table or page locks. You may actually end up slowing things down.

    The number of records you are dealing with should not be much of a problem. I would recommend you really search for the bottleneck before assuming it is a problem that can be "threaded away". The first thing I would check is your script component. Try running just a table copy from one server to the other. If this is really fast, then you may need to spend some time optimizing your code before going in the direction of trying to get your processing to thread.

  • Your problem is not that only one CPU is used. Your problem is that you use OLEDB destination component. This is the slowest possible way to insert. What is the destination database? You have to use a bulk-load component and you will easily get 10-30x speed increase.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Thanks Micheal

    Makes sense in what you are saying, but I do think that I should try the parallel approach.

    My script component actually does quite a bit of processing seeing that the data is not up to scratch.

    But I can only try and see what happens.

    thx

  • Thanks for the reply Cozyroc

    It is from a the same SQL Server 2005 machine, just from 1 DB to another DB (both sql 2005). I've never used Bulk insert task but just quickly checking on it, i see that it is from flat file to DB or can you do tbl to tbl with this component?

    Also I could really use this component for my initial load to the first (varchar) table, but currently the speed I getting loading from flat file to tbl is really not to bad, but this component is some to keep in mind.

  • You can do DB to DB. There is no limitation. Give it a try. It will probably take not more than 10 mins to setup. Post back your results.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Michael Earl (11/26/2008)


    If you really wanted to split the processing into multiple data flows, you could just copy your current data flow...

    Just a quick FYI on copying... I have read about quite a few issues, and have experienced a few of my own, when copy/pasting transformations (particularly lookup transformations). It's still unclear to me why it happens, but often when a lookup transform is copied it will generate nonspecific errors while executing. It can be corrected by deleting and recreating from scratch the transform causing the problem.

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • Hi Tim

    Just to add, I've also experienced various problems with copying of components and specially the script component. I've re-installed VS on my PC about 4 times in the past 4 months, all because the the script component giving 'unspecified errors'.

    My current solution which is working for more than a month now with out any hick-ups is to NOT install the service packs on Visual Studio. (maybe a solution for these problems)

  • CozyRoc (11/26/2008)


    You can do DB to DB. There is no limitation. Give it a try. It will probably take not more than 10 mins to setup. Post back your results.

    Hi again

    When I look at the component I only get flat file as the source connection on the Bulk Insert Task. How do I specify that I want to got from DB Tbl to DB Tbl?

    Thx

  • I have also had some copy and paste issues.

    Copying and pasting in the data flow has been pretty sketchy for me. Copying and pasing components in the control flow (including entire data flows) has usually been ok.

    I do like to paste into a sequence container or something that is different than the original location. If SSIS does not have to rename anything, it seems to work a bit better with the paste.

  • I don't know what component you are looking at, but what you have to do is insert Data Flow Task and then check your destination components for: SQL Server Destination. This is the bulk-load component you have to use.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • I was looking at the Bulk Insert Task which is found on the Control Flow. Got it now, will see what is the difference

  • Keep in mind that the SQL Server Destination component will only work if the package is running on the destination server. It is a local-only destination component because of the way it uses the file system to bulk insert.

    The OLEDB Destination component has a couple of "Fast Load" modes that are similar to (or may even use, I cannot remember) bcp. Depending on your server layout, you may have no choice.

  • CozyRoc (11/26/2008)


    Your problem is not that only one CPU is used. Your problem is that you use OLEDB destination component. This is the slowest possible way to insert. What is the destination database? You have to use a bulk-load component and you will easily get 10-30x speed increase.

    I know this is going to go against everything that people have experienced, but I get the following results:

    Running from OLE DB Source via script to OLE Destination I inserted 137 mil records into the table in 4 hours. (About 9,500 rec/sec) (The original package)

    Running a copy of this SSIS package but changing it to SQL Server destination, I inserted 15 mil records in 44 mins. (Only about 6,000 rec/sec).

    Everything is the same except for the destination in the second package.

    Is there something that I can check to make sure that I get better performance out of SSIS.

    Just to add, I dont have a lot of time to play around and try various options, but if you can point me in a direction, it would be greately appreciated.

    Chris

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply