Importing fixed width files with varying row types

  • Hi All,

    I have a question regarding the capabilities of SSIS. I have a file which i need to import into SQL, the file is made up of a header row and then the transactions for that header, then another header row followed by its transactions etc. The column widths are different for the header and detail rows. Is it possible to configure SSIS so that it uses 2 different imports for the same file? I have an identifier at the start of the header strings to tell me its a header row and not a detail row.

    Start of file example below...

    SHR1,001019155,9876,070606,070605,0000001234567,GIRO PLC - TEST DATA - DO NOT DELETE - TEST DATA - DUMMY ACCOUNT DATA ,TEST DATA

    001019155,BG100,N,50901701,001000002,00000000,0,000000, , , , ,73401

    001019155,BG120,N,50901702,001000002,00000000,0,000000, , , , ,73402

    Thanks in advance

    John

  • You can do this with SSIS, but you'll need to run it through a script component to make it work. Depending on the destination of the rows, you may need to use multiple outputs on your script component as well.

    Post back here if you need more guidance.

    hth,

    Tim

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • Thanks for your quick reply Tim, do you have any links on a simple file import tutorial using the script task? There will ideally be two outputs, one for the header records and one for the transaction details.

  • The solution I used for a similar issue required that the detail rows contain a header number. In your sample data, it looks like the header row number is the first item in the detail row. My solution worked like this:

    I read the entire row, then check to see if it is a header by using a derived column to choose only that section of the string. A conditional split on that derived field is used to send header rows into one path and detail rows into a different path for additional processing.

    Since the structure of the two is different, I assume the records go to different tables, so a separate destination control would be used for each.

  • I would go with Ben's suggestion, though splitting the "rowstring" to fields by using derived columns can be quite timeconsuming (and boring :doze: ). However, you're pretty sure you get what you bargained for.

    A different approach would be to read the same file twice, with different columndefinitions on the datasource. By discarding rows that don't match the definition, you would end up with the two datasets you need, without a lot of manual labor to define the columns. It's sure not elegant, and poses the risk of losing records without knowing it, but depending on your requirements this might be a Q&D solution :Whistling:

    Peter Rijs
    BI Consultant, The Netherlands

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply