SSIS Scalability and Parallelism

  • Hi ...

    What do you guys think about the performance of SSIS 2005 as an ETL tool compared to the likes of DataStage and Informatica in a heterogeneous environment (DB2, SQL Server 2000, SQL Server 2005, Teradata)?

    I have not had much experience with the latter tools, but I do know that they are quite expensive and still require an additional database as a repository. I have a reasonable amount of experience with SSIS and I think it is brilliant. As for scalability, I think it all depends on how you develop the ETL solution. Popular opinion here is that the parallelism of SSIS is very limited and cannot compare to that of DataStage and Informatica. I am not convinced.

    It's also been said that SSIS cannot utilise more than 2 processors. Is that true? Personally, I feel that if one's processes are developed correctly and one makes proper use of the central RDBMS, which is bound to be a powerful machine, then the number of processors that SSIS can utilise is irrelevant.

    Does anyone agree/disagree with my opinion?

    Ciao

  • SSIS 2005 is a generation 1 product. Informatica and DataStage have been around for a while, and they currently perform better than SSIS. Most of this is simply due to product maturity. SSIS 2008 performance has been improved. I have not had time to benchmark anything yet, but they re-worked a lot of the threading and it is supposed to be quite a bit faster. It will probably still be a lot slower than a much more mature product, though. You will also find it more buggy, a bit less flexible, etc.

    As for the use of multiple processors: I have it running on 16-core systems and it is capable of using all of the cores. I wish there was a bit more flexibility in the ability to limit its resource use. You do have to build packages with multi-threading in mind, and if there is a local SQL Server, offloading some of the work to the MSSQL Engine and using stored procedures can give you a real advantage.
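
    To illustrate what "building with multi-threading in mind" means in general terms, here is a minimal Python sketch, not SSIS code. It assumes the loads are independent of each other, so they can run side by side under a fixed concurrency cap (loosely comparable to the package-level MaxConcurrentExecutables setting). The names load_table, run_independent_loads and MAX_CONCURRENT are hypothetical and only there for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical cap on how many loads may run at the same time.
MAX_CONCURRENT = 4


def load_table(name):
    # Stand-in for one independent extract/load step; a real step
    # would read from a source and write to a destination.
    worker = threading.current_thread().name
    return f"{name}: loaded on {worker}"


def run_independent_loads(tables, max_concurrent=MAX_CONCURRENT):
    # Steps with no dependency on each other are handed to a pool,
    # so several run concurrently up to the configured cap.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map returns results in the same order as the input.
        return list(pool.map(load_table, tables))
```

    Calling run_independent_loads(["customers", "orders", "products"]) returns one status string per table, in input order. The point is the design: work that is chained in one long sequence cannot spread across cores, no matter how many the machine has.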

    My guess is that over time, the MS product will eventually overtake the competition. They have more resources than everyone else, and the SQL Server family of products is an area that they are currently putting a lot of focus on. There are currently areas that need a lot of improvement, so for now the big ETL products still have their place in the market and at times will be necessary.
