Eliminate carriage return in data

  • Hi,

    I'm trying to import a text file and have notice that on the first record, there is an extra carriage return after the second data element and I would like to know how I can get rid of this using SQL Server within a DTS package.  My intent is to load this text into a table on a daily basis but first I must get rid of that extra carriage return.

    For example:

    "FirstName" <Tab> "LastName"<Tab>"CompanyName"

    "John"<Tab>"Doe"<CR>

    "ACME"

    I need the data to read

    "FirstName" <Tab> "LastName"<Tab>"CompanyName"

    "John"<Tab>"Doe"<Tab>"ACME"

    Thanks,

     

  • Is there more than one row of data so it looks like this?

     

    "FirstName" <Tab> "LastName"<Tab>"CompanyName"

    "John"<Tab>"Doe"<CR>

    "ACME"<CR>

    "John"<Tab>"Doe"<CR>

    "ACME"<CR>


  • Actually an average file contains at least 500 records, however only the first record with data is the issue.  I've analyze at least five files with this situation.

  • So it looks like this?

    "FirstName" <Tab> "LastName"<Tab>"CompanyName"<CR>

    "John"<Tab>"Doe"<CR>

    "ACME"<CR>

    "John"<Tab>"Doe"<Tab>"ACME"<CR>

    "John"<Tab>"Doe"<Tab>"ACME"<CR>

    "John"<Tab>"Doe"<Tab>"ACME"<CR>

    "John"<Tab>"Doe"<Tab>"ACME"<CR>

    "John"<Tab>"Doe"<Tab>"ACME"<CR>


  • Yes,

    I guess my question becomes do I clean these descrp. in SQL Server or C#?

    Thanks,

  • I would use an ActiveX script as long as you know that you are only repairing the 2nd and 3rd line.

    Read line one(header) and write to new text file

    Read line 2 and write to variable

    Read line 3 and write to variable

    Replace <CR> with <TAB> in variable

    write variable to the new text file

    loop through the rest of the records and write to the new text file.

    import the new fixed file.

     

    I would write it for you but I use vbscript and you mentioned using C#.  Let me know if you want it in vbscript.


  • Wow it's been yrs since I've done anything in VBScript(ActiveX).  You're saying within the DTS package you would load the text file then check the text file with the ActiveX script then load into either a temp table or source table?

    I would appreciate that very much!!

    Thanks,

  • If I had my druthers I would tell whoever is creating the source data to fix it but I guess that isn't an option for you.  Does your source file name change or is it the same every time?


  • Unfortunately I don't have a say in how this file is formatted.  The file is downloaded by the users and they manipulate the file in Excel and then create a series of Excel files for their analysis.  Needless to say this eats up about 2-3 hours daily.

    What I had proposed is the following (working out the system flow):

    1. Download the file to a public directory (I have no access to this company's APIs)

    2. Using a utility, copy the day's file to another directory always overriding the file(I'll always have the original file)

    3. Use a DTS package to load the daily overridden file into a temp table or source table and build the necessary balance and exception reporting via a sproc

    4. Report via Cognos

     

    In step two I was wondering on doing file manipulation within the C# utility I'm building and therefore all the DTS package would just do is read the txt file and loaded it.

    What are your thoughts.  I always kind of thought file manipulation should be done outside the database.

    Thanks,

  • Here is the script to repair line 1 and 2.  I use DTS to do all sorts of file manipulation including downloading,copying, and importing.

    Are you downloading via ftp? 

    ' start script here

    dim fso,fl,nfl,lvar,line

    set fso=createobject("scripting.filesystemobject")

    set nfl = fso.createtextfile("newfile.txt")

    set fl= fso.OpenTextFile("junkfile.txt")

    nfl.writeline fl.readline 'write the header

    lvar=fl.readline 'read line 2

    lvar=lvar & vbtab & fl.readline ' append <TAB> and line 3

    nfl.writeline lvar ' write the new line

    do while not fl.atendofstream 'loop through the rest of the file

     nfl.writeline fl.readline

    loop

    fl.close

    nfl.close

    set fso = nothing

    'end script here


  • First of all thanks for the code.  To answer your question, no.  I'm not using http://FTP.&nbsp; The user downloads the file from this company's website and stores the file onto a network drive.  That is my starting point.

    FYI the first row does contain the column names, it's just the first data row has the extra carriage return.

    Thanks,

  • If the filename is standardized you can use the same ActiveX script to pick up the file from the public directory, fix the issue and then use a data transformation task to import into SQL.  That way you could skip the extra C# utility.  You can also use a vbscipt with msxml Control to grab the file from the web site. That way the whole process is packaged in the DTS.


  •  For CR-LF removal, I do something like this:

    update

    My_Table

    set

    ColumnA = replace(replace(ColumnA , char(13), ''), char(10), '')

    where

    charindex(char(10), ColumnA ) > 0 or charindex(char(13), ColumnA ) > 0

    Note: char(10) is LF and char(13) is CR

  • Problem is that this is a text file and since the number of fields will be wrong for the first 2 lines of data it should be corrected at the text level before trying to import it. IMHO


  • Thanks for the quick respond.  I also need to evaluate my design because I've been told that this file is going away and we'll be using some type of service to get at this data.  I need to build a short-term and long-term solution (my initial design proposal) hence why the C# utility (which is becoming more of a framework).

    I'm not sure how I want to tackle this design.  In the meantime I feel I can use the suggestions everyone has provided.

    Thanks all.  I'll keep everyone posted.

    -- Joe

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply