SSIS: Dynamically map metadata in a Data Flow Task

  • There is no in-built functionality to dynamically remap inputs and outputs in SSIS. the validation of the meta-data is an integral part of the package validation so source and destination structures have to be defined in the XML.

    you can put multiple dataflows inside one master package (I think this has already been mentioned) but from a maintenance perspective it is probably easier to have have each table in its own package and have the FEL Loop pick up a tablename, childpackage name and execute the correct child package; this way new tables do not require a change to the controller package.

    If you need to create a number of packages that do the same ETL but for different tables, you might want to look at BIML. This is a scripting language add-on for BIDS/SSDT that allows you to generate the DF from the source and destination table metadata. If you only need to create 25 DFs it is probably overkill, but if you have 2500 then the time spent would be well worth it. It would probably take you between 100 and 200 hours to get your head around BIML and generate sensible packages, but it would then build all 2500 packages in less than an hour (and every time the metadata changes, or the audit requirements are changed, you modify the BIML script and regenerate the packages - Imagine having to add a script component by hand to 2500 ETL routines!)

    BIML has some limitations: build and version numbers are not incremented and the GUID for the package changes each time so putting the packages into source code control has issues. In reality what you want to SCC is the BIML script, but I have not seen a solution for annotating the package with the BIML script version so there is no way to know which version of the script generated the package. That said, it may be a good solution to get you to prototype.

  • aaron.reese (11/27/2014)


    BIML has some limitations: build and version numbers are not incremented and the GUID for the package changes each time so putting the packages into source code control has issues. In reality what you want to SCC is the BIML script, but I have not seen a solution for annotating the package with the BIML script version so there is no way to know which version of the script generated the package. That said, it may be a good solution to get you to prototype.

    I have not encountered issues with BIML generated packages and source control, to be honest.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • @Koen,

    Yes, you can check the dtsx packages and their project into source control but the cool things that SCC gives you are negated by the way that BIML builds it's packages

    How do you do a diff compare between the two packages. If you generate the same package twice from the same script, the internals, esp GUIDs for tasks are changed so SCC will see them as different even though they are functionally the same. Also the version and build numbers are set to 1 so even if you change the script, the two different packages will have the same version number, and how do you know which SCC 'version' of a script was used to generate a specific copy of a dtsx package.

    Conceptually, the BIML script is the source and the dtsx is the compiled object and you would not normally SCC the compiled object.

    Have you found a way to get BIML to look at a build history table and inject the previously generated GUIDs and increment the version numbers? If so, I would be VERY interested as this is the one thing that holds me back from generating production packages using BIML.

  • aaron.reese (11/27/2014)


    @Koen,

    Yes, you can check the dtsx packages and their project into source control but the cool things that SCC gives you are negated by the way that BIML builds it's packages

    How do you do a diff compare between the two packages. If you generate the same package twice from the same script, the internals, esp GUIDs for tasks are changed so SCC will see them as different even though they are functionally the same. Also the version and build numbers are set to 1 so even if you change the script, the two different packages will have the same version number, and how do you know which SCC 'version' of a script was used to generate a specific copy of a dtsx package.

    Conceptually, the BIML script is the source and the dtsx is the compiled object and you would not normally SCC the compiled object.

    Have you found a way to get BIML to look at a build history table and inject the previously generated GUIDs and increment the version numbers? If so, I would be VERY interested as this is the one thing that holds me back from generating production packages using BIML.

    Good lord, I never do a diff compare of SSIS packages 😀

    The tiniest change in layout already can give great changes in the XML.

    Without BIML, do you manually change the version number?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Without BIML, do you manually change the version number?

    No, not unless there are significant changes to the package; however the BUILD number autoincrements with every build in SSIS so you can check whether the production copy is the same as the dev/test/uat copy. Packages generated with BIML are always build #1 so you have lost your basic check.

    I agree about layout changes having drastic effect on the XML; however you can use the diff compare to pick up 'simple' changes like a change to the datatype or the build#.

    Sorry to all for hijacking this thread, Koen, should be start a new one about BIML?

    Short of that, I would like to see some sort of dependency injection when compiling packages. E.g. if I have a 'standard' audit logger then I can stub this out and inject it into all the packages that use it. In the event that it changes, all I need to do is rebuild those packages that use that stub and the injected code is updated. Pragmatic works have a task that 'sort of' does that but only in the dataflow task.

    I would also like to see template inheritance so that packages that are generated with a particular template can be updated by updating the template.

  • aaron.reese (11/27/2014)


    I would also like to see template inheritance so that packages that are generated with a particular template can be updated by updating the template.

    That would be totally awesome.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • aaron.reese (11/27/2014)

    Sorry to all for hijacking this thread, Koen, should be start a new one about BIML?

    Wouldnt it be better to have an article or Stairway series on BIML as its one of the things I'd love to get my teeth into in more detail but haven't seen much in the way of subject matter on the topic.

    Are there any books out there (MS press or other), that delve into this area?

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason-299789 (12/1/2014)


    aaron.reese (11/27/2014)

    Sorry to all for hijacking this thread, Koen, should be start a new one about BIML?

    Wouldnt it be better to have an article or Stairway series on BIML as its one of the things I'd love to get my teeth into in more detail but haven't seen much in the way of subject matter on the topic.

    Are there any books out there (MS press or other), that delve into this area?

    You mean this stairway[/url]? 😉

    No books yet unfortunately.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (12/1/2014)


    Jason-299789 (12/1/2014)


    aaron.reese (11/27/2014)

    Sorry to all for hijacking this thread, Koen, should be start a new one about BIML?

    Wouldnt it be better to have an article or Stairway series on BIML as its one of the things I'd love to get my teeth into in more detail but haven't seen much in the way of subject matter on the topic.

    Are there any books out there (MS press or other), that delve into this area?

    You mean this stairway[/url]? 😉

    No books yet unfortunately.

    Thanks Koen, :w00t:

    I didn't know SSC already had a stairway series, I really should have search on the subject.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I believe that all the standard components in ssis are .NET classes so you could always write your own c# or vb.net to generate packages but it would be a really steep learning curve!

  • aaron.reese (12/1/2014)


    I believe that all the standard components in ssis are .NET classes so you could always write your own c# or vb.net to generate packages but it would be a really steep learning curve!

    I posted a solution that did that (see page 1), but it required the source and destination table to have the same column names, but BIML would make it more transparent, and also act as a base Source to Target mapping document.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply