SSIS and dealing with embedded double quotes

  • Phil Parkin (1/7/2015)


    mrtrich99 (3/27/2013)


    Useful suggestion, but I do hope the OP solved his problem 2 years ago 😉

    just like most people suggested, i now request for pipe delimiters, but sometimes I do still get files that are comma delimiter and the user extracting the data doesn't know how to change the delimiter. in those cases I have no choice but to suck it up and clean the data up manually.

    Can you provide some sample failing data and a copy of the error message?

    As I said before, I hoped he solved his issue 2 years ago 😉

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (1/7/2015)


    Phil Parkin (1/7/2015)


    mrtrich99 (3/27/2013)


    Useful suggestion, but I do hope the OP solved his problem 2 years ago 😉

    just like most people suggested, i now request for pipe delimiters, but sometimes I do still get files that are comma delimiter and the user extracting the data doesn't know how to change the delimiter. in those cases I have no choice but to suck it up and clean the data up manually.

    Can you provide some sample failing data and a copy of the error message?

    As I said before, I hoped he solved his issue 2 years ago 😉

    Damn it, I quoted the wrong post!! 🙂

    I meant to quote this one, from today:

    running into this same issue. I read that 2012 could handle column dilimiters embedded in a text however It's not working for me. When i preview the data it woks fine however fails when trying to bring down the file. Also, I'm consuming the files and they are being generated by an ad server. So i cant specify what the delimiter will be. Not exactly sure how to proceed other than to use TALEND. I cant remove them from the data because its the "name" column and if it has double quotes in the name then so be it. This is really starting to make me angry.

    If i end up having to convert this ssis to talend then i'm going to use talend going forward w/ the intent of moving away from ssis totally.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Closing the loop here. I had to knock the dust off my c# to create the below. however it worked for me.

    The problem it solves: The issue is i have .csv text qualified files. in these files on occasion i'd get a doule quote in the data itself and ssis who chock.

    example below

    1000000,213452,"hello guys","This is the string that "would break ssis","goodbye world"

    The double quote in front of the word "would" is whats killing it. That should be two double quotes and it would treat it as one and process correctly.

    The below visual studio 2010 c# code run as an .exe will parce the folder it is stored in and in this case loop through every file that has the string "18_CREATIVE_" and "CSV" in it's name. It will then do several replaces to find and replace all single double quote in the data and replace it w/ two double quotes. Lastly it calls a proc passing in a string that has the filenames it went through. This proc sends an email to my team specifying what files were processed. This portion is commented out however but u get the picture. Hope this help the next poor sap who stumbles upon this issue

    using System;

    using System.IO;

    using System.Collections.Generic;

    using System.Linq;

    using System.Text;

    using System.Windows.Forms;

    using System.Data;

    using System.Data.SqlClient;

    namespace TextQualifierInDataClenser

    {

    class Program

    {

    static void Main(string[] args)

    {

    string OutputEmailBody = "";

    int sendMail = 0;

    string CurrDir = Directory.GetCurrentDirectory();

    Console.WriteLine(CurrDir);

    foreach (var file in System.IO.Directory.GetFiles(CurrDir))

    {

    int LOCAL = 0;

    int isCreative = 0;

    string uFile = file.ToUpper();

    LOCAL = uFile.IndexOf(".CSV");

    isCreative = uFile.IndexOf("18_CREATIVE_");

    if (LOCAL > 0 && isCreative > 0)

    {

    try

    {

    string contents = File.ReadAllText(file);

    contents = contents.Replace("\",\"", "~~~");

    contents = contents.Replace(",\"", "RRR~~~");

    contents = contents.Replace("\",", "LLL~~~");

    contents = contents.Replace("\"\r", "EEE~~~");

    contents = contents.Replace("\r\"", "BBB~~~");

    contents = contents.Replace("\"\"", "\"");

    contents = contents.Replace("\"", "\"\"");

    contents = contents.Replace("BBB~~~", "\r\"");

    contents = contents.Replace("EEE~~~", "\"\r");

    contents = contents.Replace("LLL~~~", "\",");

    contents = contents.Replace("RRR~~~", ",\"");

    contents = contents.Replace("~~~", "\",\"");

    File.WriteAllText(file, contents.ToString());

    OutputEmailBody = OutputEmailBody + file + "\r";

    sendMail = 1;

    }

    // catch ONLY those exceptions you expect

    catch (System.ArgumentException e)

    {

    // ALWAYS log an error, expected or otherwise.

    Console.WriteLine("error", e);

    }

    }

    }

    //i created a stored prod that took in a string and the string in this case specified whick files were clensed and emails it to my team. However this is not needed

    if (sendMail > 0)

    {

    //code to call proc that sends email

    /* using (var conn = new SqlConnection("Server=XXXXXX;DataBase=XXXXXX;Integrated Security=SSPI"))

    using (var command = new SqlCommand("Stored Procedure Name", conn)

    {

    CommandType = CommandType.StoredProcedure

    })

    {

    OutputEmailBody = "The below Files were adjusted by the TextQualifierInDataClenser. You may want to look them over and ensure that nothing out of the norm was adjusted. Should just be fixing the single double quote in the data" + "\r" + "\r" + OutputEmailBody;

    conn.Open();

    command.Parameters.Add(new SqlParameter("@bodyOut", OutputEmailBody));

    command.ExecuteNonQuery();

    conn.Close();

    }*/

    }

    //these were for testing. the write to console tells me what was processed and the messageBox was used to stop the process from completing w/o me having to click the message box.

    // Console.WriteLine(OutputEmailBody);

    //MessageBox.Show("ff");

    }

    }

    }

Viewing 3 posts - 31 through 32 (of 32 total)

You must be logged in to reply to this topic. Login to reply