creating a data warehouse

  • My company is looking to create a new data warehouse - on a budget. A few years ago (better times) my company spent big bucks on DataStage 7.1 (Ascential - now IBM).

    Most of our transactional data is on a UniData system. We may need to draw data from Oracle, which is being used for a new application. The data warehouse will be using SQL Server.

    DataStage is OK, but there are issues - not a lot of people have experience with it, it's a bit "clunky" in a number of ways.

    Any opinions out there about DTS or something else as a solution? I'm not too exciting about the speed I've seen with DTS. And just because we have DataStage doesn't mean we have to use it.

    Thanks..

  • Hi Greg,

    Our warehouse uses DTS for ETL and we've been pretty happy with it.  We decided on it because we were already spending a hefty amount on business intelligence and name & address matching software.  Also, the consultants that we hired to design and implement the initial data warehouse had expertise in DTS programming.

    Our DW analysts say that DTS performance is okay for the amount of data that needs to be loaded.

    Greg

    Greg

  • Greg,

    DTS is fast or slow depending on the types of operations you are performing along with the size of data you are warehousing (not to mention the frequency you are moving the data).  As long as you keep your cursors to a minimum and keep your datatypes lean, DTS will do almost anything you need it to.

    If you want to really appreciate DTS (and even Analysis Server for that matter), take a crash course in SAP BW and you'll see what an incredibly expensive headache-waiting-to-happen looks like.  😉

    Chris

  • We also use DTS for our ETL solution and have found it is efficient for our needs. The price was right as well. if you are working on a budget, why add additional cost for an external ETL tool, when you get one with SQL Server? The money saved can be put towards a more beefy server, or disk space.

  • Greg,

    We also are looking to create a data warehouse, on a budget, against a Unidata database.  We also are looking at building the data warehouse in MS SQL Server.

    One of the big issues I am facing is, how to get the data out of Unidata into SQL Server?

    We currently are using the Safari ODBC driver (from ASG) to connect to the Unidata database, but since Unidata is multi-value, it is rather slow.  From what I have heard, though, the Safari ODBC driver is supposed to be better than the Unidata ODBC driver from IBM, as well the the ODBC driver from Crystal, since Safari has been working for five years with Datatel (the company that has the application with the Unidata back end that we use) on improving it.  It's decent, but I couldn't get it to work with DTS.  I was able to use MS Access as the "data copying tool" to copy the data from Unidata through Safari to MS SQL Server.  I am concerned that, as I add additional tables/fields to the data copy from Unidata in building the data warehouse, there eventually won't be enough time at night to copy all the data we want using the slow Safari ODBC connection.

    What method are you using to get the data out of Unidata?

    Dan

  • We're using IBM Ascential's DataStage 7.1, but it costs around $80K, and the number of skilled users is not large. We've already got the software (back in 2003) but the developers with the most experience have moved on, so it's just me now.

    I'm just saying that we really don't have to use it, just because the company bought it. It would be nice to have something like DTS or scripting to do the ETL because there are more resources for getting the job done and fixing problems.

  • Greg,

    I agree, it would be nice to be able to use something like DTS for the ETL.

    From my experience trying to extract data from Unidata, though, I have not been able to extract it with DTS.  That's why I am currently using Safari via a custom MS Access database to copy the data from Unidata into MS SQL Server.

    I'm considering using SSIS (2005 version of DTS) for the Transform and Load portions of the ETL, but do not know of a way to Extract directly from Unidata into MS SQL Server.

    If there is no or minimal cost to keep DataStage, perhaps you would be able to perform the extract in DTS using the DataStage connection to your Unidata database.  Or if it does not work to extract via DTS, perhaps you could Extract with DataStage to MS SQL server copies of the Unidata tables, then still perform your Transform and Load with DTS.  That is what I am currently considering as a solution for us (using Safari with Access for the extract, of course, instead of DataStage).

    If you do find a way to Extract the data from Unidata using DTS or SSIS, or even something substantially less expensive than DataStage, I would be *very* interested in hearing about it.

    Dan

  • My approach to DTS for warehouse ETL has been to use DTS for what it's good at - pulling data from disparate data formats and sources, workflow & scheduling.

    However, I try to get everyting into SQL staging tables as early as possible in the ETL process and do everything in set-based T-SQL stored procs from then on, with DTS in the background executing the procs in the correct order at the correct time.

    The advantages to this are:

    - Performance. A DTS datapump that is operating row-wise and using VBScript transforms is always going to get its ass kicked by set-based SQL

    - Source Code Control/Versioning. Ever tried to version/diff a DTS package ? Much easier with T-SQL stored procs scripted and put in VSS or other source code control tool

    - Training, staffing. More people know SQL than know the many weird & wonderful ways of doing things in DTS. Typical response from a T-SQL veteran presented with a complex DTS package: "Huh ? the Package is self-modifying ? You mean the package as coded changes itself at runtime to adapt to its environment ? How can I debug this in DEV, if it modifies itself in Production ? *runs back to cubicle and hides* "

     

     

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply