Importing user data from Excel/CSV

  • Hi,
    I have a growing need to import recurring data from Excel  - it needs to be treated like production data !!!  This gives me horrors to be honest.
    Users tend to think Excel is like a data base unfortunately.
    We tend to keep control of the Excel imports i.e. converting to CSV, importing via import task in SSMS (just dump in the data initially to staging table) We then do the data cleaning with TSQL.
    I wondered if it's possible to automate this in a way that puts the onus on the user providing the workbook in the agreed state.  I appreciate SSIS can't really watch folders.

    Thinking along the lines of :

    1. Users drops in a CSV file to a folder,
    2. SSIS job polls folder every 30 mins.  
    3. C# to check for file and validate specification
    4. Import to data once validated
    5. If failing give enough information back to the user to help them fix the issue(via email possibly)
    Some of the errors I want to catch are related to users putting multiple types in a column or changing the column positions.
    Can anyone suggest an alternative approach?
    Is there a custom component that provides this functionality?
    Best

    Lee

  • leehbi - Monday, June 5, 2017 4:54 AM

    Thinking along the lines of :

    1. Users drops in a CSV file to a folder,
    2. SSIS job polls folder every 30 mins.  
    3. C# to check for file and validate specification
    4. Import to data once validated
    5. If failing give enough information back to the user to help them fix the issue(via email possibly)
    Some of the errors I want to catch are related to users putting multiple types in a column or changing the column positions.
    Can anyone suggest an alternative approach?
    Is there a custom component that provides this functionality?

    Your process sounds okay, but step 3 seems to be the problematic step to me.  You can fairly easily check the header to see has the correct number of fields are there.

    1) What if the user embeds a delimiter (tab, comma, double quote... whatever you are using as a delimiter) within a text field in the Excel document?  This will cause a flat file source to choke in your SSIS import process.

    2) What if the user puts bogus data into a field you're assuming is datetime (e.g. 02/31/2017) -- yes, you can work around this.

    What if you just ran your import process and have that kick of the notification email if/when it fails.  Import to a seperate staging table so bad data/half loaded data doesn't mess up other things.  Just a thought.

    CozyRoc has a data source that may be more flexible than the standard flat file source.  I've never used it, so I don't know if it would help your situation.
    http://www.cozyroc.com/ssis/dynamic-data-flow

    Good luck,
    Rob

  • Thanks Rob - good food for thought.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply