ETL

  • Revenant (11/4/2011)


    GSquared (11/4/2011)


    . . . Nothing wrong with building your own .NET solutions, but frequent breaks in an SSIS package should be cause for re-examination of how the package was built and tested, not a rejection of SSIS.

    We are importing tens of millions of rows from flat files. Unfortunately, it may happen that the data contains a newline character and the line is broken into two fragments that cannot be parsed. Then C# comes quite handy: we can find what is the length of the line following the one that failed parsing, and if the length indicates that indeed this is the case, we join the two segments and reparse.

    Trivial, but SSIS just does not have tools to handle this natively.

    Makes total sense. But that's not a question of SSIS breaking frequently. That's a tool it's missing.

    There are plenty of "cleaning up dirty data" tools that SSIS is missing or has implemented weakly. No arguing with that.

    All I was arguing with was "SSIS breaks frequently", since my experience with it belies that statement.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • We use SSIS to load a data warehouse as well as perform other database loads and it has been up to the task every time. Like others have mentioned, it can be a bit fragile and the error handling still isn't quite what it should be. Overall though, I think it is a huge improvement over DTS and has improved even more so since this editorial was first published. Sure it still lags behind tools like Informatica, but it is also a whole lot cheaper and easier to jump into without training.

  • Revenant (11/4/2011)


    I love BIDS, but I have a gripe: it is available only in VS2008, not in VS2010, and it forces me to keep both versions side by side.

    This is the main reason why I haven't dived into SSIS. I practically live in VS2010, where I am maintaining and developing multilple databases. Tasks that I might have implemented in SSIS, I implemented in T-SQL instead.

    Jay Bienvenu | http://bienv.com | http://twitter.com/jbnv

  • SSIS is a great tool , I work a lot with Informatica and Data Integrator but keep coming back to SSIS when possible due to the flexibility of it. They just need to create a I/U/D destination for SQL sever and it would be perfect..

    Using any ETL tool is by far the best way to load a DW, a lot of the problem DWs that I am brought into fix rely heavily on t-sql stored procs. to load data and these can be a nightmare to debug.

  • I also think that SSIS is a great tool, if you use it appropriately.

    The ease of use is sometimes the product's biggest downfall. I see too many people who use components (like the SCD component) and then scratch their heads when it performs poorly.

    In my opinion some of these components should never have been included in the product, because it encourages people to use it...especially if they don't know any better or are uncomfortable writing SQL code. But that's maybe a debate for another time...

    SSIS really shines when working with disparate sources, and for controlling workflow. I've done quite a few SSIS projects to integrate different systems, and it really works well.

  • Martin Schoombee (11/4/2011)


    The ease of use is sometimes the product's biggest downfall. I see too many people who use components (like the SCD component) and then scratch their heads when it performs poorly.

    In my opinion some of these components should never have been included in the product, because it encourages people to use it...especially if they don't know any better or are uncomfortable writing SQL code. But that's maybe a debate for another time...

    .

    couldn't agree more with statement concerning SCD, the transform they have included is awful and it should be a lot better or not there at all.

    In data integrator a full set-based SCD can be setup with about 4 clicks of the mouse compared to a lot of donkey work in SSIS, and its difficult explaining to clients why they shouldn't use it

    still rate it overall though

  • We use SSIS heavily and is the main tool for data transfer between system. As of this morning, we have 623 prodcution jobs running on various schedules and over 700 package executions a day. SSIS does not break, application code does.

    Yes, it is a different approach than the text book ETL methods, but once you learn how to use the tool correclty and desing jobs that are restartable, then it runs like a champ.

    Most of the issues we had in the early days, especially with security, were solved by using a custom build executable instead of the out-of-box DTExec.

  • I would love to see your benchmarks of custom .NET for integration vs SSIS

  • Mark Stacey (11/4/2011)


    I would love to see your benchmarks of custom .NET for integration vs SSIS

    Esp in an article. If you want to write it up, show a scenario and give time to build, run, issues, for one type of upload, it would be great.

  • We still have a lot of DTS, although we try and do anything new in SSIS. My impression is that although SSIS is more powerful, it is much harder to use. I could knock out DTS packages really quickly, but SSIS is always a trial. Getting it to do anything is almost like pulling teeth and in many ways it is much more restrictive.

    Microsoft can stick DTS under a "legacy" folder as much as they like but for simple jobs it was still a lot more time effective.

    They need to remember that people want a way of delivering robust functionality quickly and that checking and validation which most people dont need is just a frustration.

  • SSIS does more than DTS. You will appreciate these "hidden" features when you have to copy say 2B rows: data flow task will handle it, INSERT ... SELECT will likely overflow your log.

  • Raymond Lee-270096 (11/4/2011)


    Compared to DTS it is a huge step up. Compared to tools like Sagent, Informatica, and Ab Initio it is still a "better than nothing but not ready for prime time" tool. It is inexpensive but you get what you paid for. Hopefully it will continue to improve.

    There are reasons Cadillac and Chevy are different which is reflected in the price you pay, but we also know there are Chevy that can give you almost or better performance. In my day job people including my manager come up with a lot of things SSIS cannot do and I show them where it is defined and how it can be used. There were people in the Microsoft SSIS team that wanted to limit SSIS uses but Microsoft being an engineering company watched how the RDBMS vendor agnostics with .NET skills changed enterprise data with the product. So what you get from SSIS is based on skills, it is free so you need to pay for the skill that can use a combination of components to get what you need.

    Note to self load Denali packages in Asp.net REST pages almost Informatica cost zero dollars.

    Kind regards,
    Gift Peddie

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply