Inserting VBA Procedure into CSV Files Using SSIS

  • Hi to all you SSIS gurus out there. I am a complete newbie with SSIS and I need assistance with opening multiple .csv files in a folder and then adding a procedure to each file that changes and adds headers to transform the data for loading into our database. Can someone point me in a general direction? The procedure is not written yet, but I am fairly fluent with Excel VBA and I would like to add a VBA procedure to each and every file. Is this possible?

    Regards:
    Mordred
    Keep on Coding in the Free World

  • I am not sure if this is what you're asking, but it sounds like you would like to add a VBA macro to each csv file that will fire when the file is opened in Excel? Is that a correct assessment of your question?

    Stepping back for a second, your question seems to be about standardizing the file layouts so they are all consistent. Is the end goal simply to load the files into a database, or to prepare them for use by end users in Excel?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi opc.three,

    First off, thanks for the reply, and yes, you are right: I want to add a VBA macro to each csv file that will fire when opened. And yes, I want to standardize the file layouts to load them into a database. Our database adds a mnemonic (per user requirement) that does not exist in the original source files, so I have to somehow transform the source data to match the table designs that I have.

    I know that I can use the For Loop Container, but I am not familiar enough with all this yet. I was hoping to formally learn SSIS and BIDS, but unfortunately the company I work for doesn't have the $$ in our department budget, so I am on my own. I'm awaiting a book, but I don't know how good it will be until I get it. I am eager to learn, though.

    The macro itself I can write, as I am very familiar with VBA; the problem is getting it into source files that will be deleted after the load, so I need the macro to be reusable each time there is an update to the database.

    Regards:
    Mordred
    Keep on Coding in the Free World

  • ...

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Sorry, my last post was submitted prematurely...

    I am not aware of any way to implant a macro into a csv file. Into an xls or xlsx file, sure, but not a csv. Even if it is possible, I am not sure you want to go in that direction in this scenario. SSIS should be able to handle all your data processing needs. Besides, SSIS is typically run in an unattended fashion, e.g. as a job, and the use of Office Automation on a server in an unattended context is frowned upon and even formally advised against by Microsoft.

    Your VBA programming skills may still be put to good use within SSIS, however, in the form of Script Tasks or Script Components. These objects help us extend SSIS when the built-in tasks and components simply won't do. I usually try to find a way to do things using a built-in solution, but sometimes the Script Task/Component is the only option, and sometimes the better one.

    Regarding the For Loop Container, sqlshare.com is a great site for getting started with SSIS. Since SSIS is a mostly visual technology, guided, tutorial-style videos are a natural fit:

    SQL Share > For Loop Container

    SQL Share > SSIS search results

    If you post a sample file or two and more detailed requirements, I can likely point you in a good general direction on how to break the problem down in terms of SSIS and accomplish the end goal.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi opc.three, and thanks. I'm not sure which file you would want me to upload, but I am assuming it is the .csv, so here it is. Bear in mind that I am in the infant stages of the system I am designing, and I don't know if everything is right regarding my tables.

    I am still a student and my experience with this is still minimal. All I have seen as far as database design goes are databases for students, courses, faculties, and the like for university. Now that I am in the real world, I find the transition from what I have learnt to real-world practices is not that easy.

    I have uploaded a Word document that has my table design on it, and I also have the .csv source file for you to look at. There are still parts of this that I have to figure out with the user, but so far this is where I'm at.

    EDIT: I cannot upload from work for some reason as I keep getting timed out. Below is a rough description of what's going on, but to eyes that aren't working on what I am, it may be difficult to follow; sorry if that's the case. At any rate, I have emailed the files to my personal email address and should have no problem uploading them when I get home from work tonight.

    From the .csv file to tblEconomicJoin ("|" indicates the split between the csv value and the table column):

    csv value     | Table column
    -             | Source_Mnemonic (SC for Statistics Canada)
    08/03 - 08/06 | Season_Mnemonic (SP for Spring)
    Ref_Date      | Months
    GEO           | Region_Mnemonics (CAN for Canada)
    COMPONENTS    | Variable_Mnemonic (TG for Total Generation)
    COMPONENTS    | Variable_vSeries (v44158996 for Canadian Total Generation)
    Value         | Variable_Data
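
    For illustration only, here is a rough T-SQL sketch of how that mapping might translate into tblEconomicJoin. The column names come from the mapping above, but the data types and lengths are purely guesses on my part and not my actual design (that is in the Word document):

    -- Hypothetical sketch only; column types and lengths are guesses, not the real design.
    CREATE TABLE dbo.tblEconomicJoin
    (
        Source_Mnemonic   varchar(10),  -- not in the csv; added per user requirement (e.g. SC)
        Season_Mnemonic   varchar(10),  -- derived from the 08/03 - 08/06 range (e.g. SP)
        Months            varchar(20),  -- from csv Ref_Date
        Region_Mnemonics  varchar(10),  -- from csv GEO (e.g. CAN)
        Variable_Mnemonic varchar(10),  -- from csv COMPONENTS (e.g. TG)
        Variable_vSeries  varchar(20),  -- from csv COMPONENTS (e.g. v44158996)
        Variable_Data     varchar(50)   -- from csv Value
    );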

    Regards:
    Mordred
    Keep on Coding in the Free World

  • What I would really like to see is a before and after CSV file.

    The before file being what you receive from your external data supplier.

    The after file being what you envisioned the file looking like after you did what you described in your original post as "change and add headers to transform the data for loading into our database."

    Essentially I want to know what you had planned to implement in VBA to get your files ready for import into your database, by example via the before and after CSV files. From there I can help you determine how to use SSIS to transform the before file and load it into your database.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I will get that to you as promptly as I can.

    Regards:
    Mordred
    Keep on Coding in the Free World

  • How is it going?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Sorry for such a late response. I've been working at learning SSIS and have created a Data Flow Task to do my editing for me. My DFT contains a Flat File Source, a Data Conversion, and an OLE DB Destination. I am having a difficult time actually loading one table with this task and have received a lengthy error report:

    SSIS package "Package.dtsx" starting.

    Information: 0x4004300A at Data Flow Task, SSIS.Pipeline: Validation phase is beginning.

    Information: 0x4004300A at Data Flow Task, SSIS.Pipeline: Validation phase is beginning.

    Information: 0x40043006 at Data Flow Task, SSIS.Pipeline: Prepare for Execute phase is beginning.

    Information: 0x40043007 at Data Flow Task, SSIS.Pipeline: Pre-Execute phase is beginning.

    Information: 0x402090DC at Data Flow Task, Flat File Source [1]: The processing of file "C:\SQLCSVFiles\Tables_And_Links.csv" has started.

    Information: 0x4004300C at Data Flow Task, SSIS.Pipeline: Execute phase is beginning.

    Error: 0xC02020A1 at Data Flow Task, Flat File Source [1]: Data conversion failed. The data conversion for column "Table_Links" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".

    Error: 0xC020902A at Data Flow Task, Flat File Source [1]: The "output column "Table_Links" (41)" failed because truncation occurred, and the truncation row disposition on "output column "Table_Links" (41)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.

    Error: 0xC0202092 at Data Flow Task, Flat File Source [1]: An error occurred while processing file "C:\SQLCSVFiles\Tables_And_Links.csv" on data row 2.

    Error: 0xC0047038 at Data Flow Task, SSIS.Pipeline: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Flat File Source" (1) returned error code 0xC0202092. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.

    Information: 0x40043008 at Data Flow Task, SSIS.Pipeline: Post Execute phase is beginning.

    Information: 0x402090DD at Data Flow Task, Flat File Source [1]: The processing of file "C:\SQLCSVFiles\Tables_And_Links.csv" has ended.

    Information: 0x4004300B at Data Flow Task, SSIS.Pipeline: "component "OLE DB Destination" (15)" wrote 0 rows.

    Information: 0x40043009 at Data Flow Task, SSIS.Pipeline: Cleanup phase is beginning.

    Task failed: Data Flow Task

    Warning: 0x80019002 at Package: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (4) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.

    SSIS package "Package.dtsx" finished: Failure.

    From what I can tell, SQL Server is not liking the strings (which are URLs) that are being passed: they are being truncated, and the truncation is causing the failure. From some of the interweb reading I have done, one possible reason may be that SQL Server is not sure what to do with one or more characters in a URL. I don't know for sure, though, and I'm not sure how to fix the other errors. All my data types match from source to destination via the Data Conversion, so I don't understand why I'm having issues with what should be an easy process. I'm a bit stumped but determined.

    Regards:
    Mordred
    Keep on Coding in the Free World

  • Alright, I think the issue might be that I have some duplicate values in the csv file being passed into the PK fields. Is the csv file still needed in order to figure out how to remove duplicates using SSIS & BIDS? I think my next task will be to figure out how to remove duplicates from the source if they exist.

    Regards:
    Mordred
    Keep on Coding in the Free World

  • Mordred (7/28/2012)

    Alright, I think the issue might be that I have some duplicate values in the csv file being passed into the PK fields. Is the csv file still needed in order to figure out how to remove duplicates using SSIS & BIDS? I think my next task will be to figure out how to remove duplicates from the source if they exist.

    If your source file is delimited, then chances are SSIS assigned too short a length to one or more of the file fields. Look in the Flat File Connection Manager Properties for the Advanced page and make sure each field is set to a wide-enough length to accommodate your data.

    Regarding duplicates, a couple of options for you:

    1. Load the file into a staging table first, i.e. a table with no PK, then issue a T-SQL INSERT...SELECT with a GROUP BY or a DISTINCT to load your final table from the staging table without duplicates (see the sketch after this list).

    2. If your requirements are such that you must load directly into a table with a PK, then you can use the SSIS Sort Transformation to remove duplicates prior to loading the table.
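
    Here is a minimal T-SQL sketch of option 1, assuming a hypothetical staging table named stg_EconomicJoin; the column names are placeholders taken from your earlier mapping, so substitute your actual table design:

    -- 1. Staging table: same columns as the final table but no PK, so duplicate rows are allowed in.
    CREATE TABLE dbo.stg_EconomicJoin
    (
        Months            varchar(20) NULL,
        Region_Mnemonics  varchar(10) NULL,
        Variable_Mnemonic varchar(10) NULL,
        Variable_Data     varchar(50) NULL
    );

    -- 2. Point the SSIS OLE DB Destination at the staging table. After the data flow,
    --    an Execute SQL Task (or manual T-SQL) moves only distinct rows into the final table.
    INSERT INTO dbo.tblEconomicJoin (Months, Region_Mnemonics, Variable_Mnemonic, Variable_Data)
    SELECT DISTINCT Months, Region_Mnemonics, Variable_Mnemonic, Variable_Data
    FROM dbo.stg_EconomicJoin;

    -- 3. Clear the staging table so the next file load starts clean.
    TRUNCATE TABLE dbo.stg_EconomicJoin;

    If duplicates are defined by only a subset of columns, swap the DISTINCT for a GROUP BY on the key columns with MAX() or MIN() on the rest.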

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I really appreciate your expertise on this matter. I am going to give your first suggestion a go, as I think it will be a helpful technique for other csv files in the future. I'll let you know how it goes after I play around with your ideas a bit. Again, thank you.

    Regards:
    Mordred
    Keep on Coding in the Free World

  • If your source file is delimited, then chances are SSIS assigned too short a length to one or more of the file fields. Look in the Flat File Connection Manager Properties for the Advanced page and make sure each field is set to a wide-enough length to accommodate your data.

    That was bang on. Thank you so much for your assistance here. I'm now going to make a staging table to protect me against dupes, and all should be good. Again, thanks a lot; you've saved my sanity.

    Regards:
    Mordred
    Keep on Coding in the Free World

  • You're welcome.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato
