Error when loading CSV file to SQL Server using SSIS. The Column Delimiter was not found

  • Can someone help me with loading the attached csv file database.
    The issue here is the CSV file has data with Double quotes and comma seperated values but there are some instances where one of the column has Doule quotes in the data also.(Show with Bold)

    ---FYI.. This is a single record in my CSV File....
    "33043000030000","24","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE A  1","NORTH DAKOTA STATE A","1","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE "A" #1","10/21/1950","5609","KIDDER","141 N","73 W","36","SENE","1980 FNL 810 FEL","WILDCAT","","","","VERTICAL","46.990158000000001","-99.864013","OG","DRY","","12/24/1950"

    I read online and people are suggesting me to write VB code.
    Can somone help me on this.
    Thank you.

  • kashyap4007 - Tuesday, August 28, 2018 6:11 PM

    Can someone help me with loading the attached csv file database.
    The issue here is the CSV file has data with Double quotes and comma seperated values but there are some instances where one of the column has Doule quotes in the data also.(Show with Bold)

    ---FYI.. This is a single record in my CSV File....
    "33043000030000","24","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE A  1","NORTH DAKOTA STATE A","1","MAGNOLIA PETROLEUM CO.","NORTH DAKOTA STATE "A" #1","10/21/1950","5609","KIDDER","141 N","73 W","36","SENE","1980 FNL 810 FEL","WILDCAT","","","","VERTICAL","46.990158000000001","-99.864013","OG","DRY","","12/24/1950"

    I read online and people are suggesting me to write VB code.
    Can somone help me on this.
    Thank you.

    Whoever provided the data didn't follow the rules for CSV and double up on the quotes that were embedded in a field.  And people suggesting the VB route apparently don't actually have a clue either because VB will split the line as if it had 2 extra fields, as well.

    Are all the lines screwed up like this?  If they're consistent, then no need for VB or anything else crazy.  But ALL the lines need to have the same very consistent mistake and then we can salvage the data pretty easily. 

    So is the mistake consistent on every line or not?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thank you Guru for the quick reply.
    I am attaching the sample data here.
    To answer your question, this is not consistent.

  • You've still not attached any data but... I'm not sure that you should.  Is there any private or sensitive information in the file?  And "guru" is just the level of my participation, not my name or handle.  My name is Jeff Moden... Look above people's avatars for their names.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff,
    This is public data which i got from website so I think i can share.
    Below is 2 more records.

    3.31E+13 41 AMERADA HESS CORPORATION KERMIT HALVERSON     1 KERMIT HALVERSON 1 AMERADA PETROLEUM CORP. KERMIT HALVERSON #1 11/30/1951 9705 MCKENZIE 152 N 96 W 26 NESE 1830 FSL  660 FEL CLEAR CREEK VERTICAL 47.953663 -102.921996 OG DRY 4/22/1952
    3.31E+13 42 AMERADA HESS CORPORATION BEAVER LODGE-MADISON UNIT     V-26 BEAVER LODGE-MADISON UNIT V-26 AMERADA PETROLEUM CORP. N. DAKOTA STATE A" #1" 12/11/1951 8595 WILLIAMS 156 N 95 W 16 NESE 1980 FSL  660 FEL BEAVER LODGE MADISON 911269|459128|0 02/11/1952|521|18|641 VERTICAL 48.332938 -102.895553 OG PA 8/3/1984
  • The file has been built incorrectly.
    No parsing tool can fix it.
    Must be corrected at the source.

    "NORTH DAKOTA STATE A"",""1",
    "NORTH DAKOTA STATE ""A"" #1",
    and so on.

    _____________
    Code for TallyGenerator

  • kashyap4007 - Tuesday, August 28, 2018 8:37 PM

    Jeff,
    This is public data which i got from website so I think i can share.
    Below is 2 more records.

    3.31E+13 41 AMERADA HESS CORPORATION KERMIT HALVERSON     1 KERMIT HALVERSON 1 AMERADA PETROLEUM CORP. KERMIT HALVERSON #1 11/30/1951 9705 MCKENZIE 152 N 96 W 26 NESE 1830 FSL  660 FEL CLEAR CREEK VERTICAL 47.953663 -102.921996 OG DRY 4/22/1952
    3.31E+13 42 AMERADA HESS CORPORATION BEAVER LODGE-MADISON UNIT     V-26 BEAVER LODGE-MADISON UNIT V-26 AMERADA PETROLEUM CORP. N. DAKOTA STATE A" #1" 12/11/1951 8595 WILLIAMS 156 N 95 W 16 NESE 1980 FSL  660 FEL BEAVER LODGE MADISON 911269|459128|0 02/11/1952|521|18|641 VERTICAL 48.332938 -102.895553 OG PA 8/3/1984

    You seem to have left the delimiters out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • kashyap4007 - Tuesday, August 28, 2018 8:37 PM

    Jeff,
    This is public data which i got from website so I think i can share.
    Below is 2 more records.

    3.31E+13 41 AMERADA HESS CORPORATION KERMIT HALVERSON     1 KERMIT HALVERSON 1 AMERADA PETROLEUM CORP. KERMIT HALVERSON #1 11/30/1951 9705 MCKENZIE 152 N 96 W 26 NESE 1830 FSL  660 FEL CLEAR CREEK VERTICAL 47.953663 -102.921996 OG DRY 4/22/1952
    3.31E+13 42 AMERADA HESS CORPORATION BEAVER LODGE-MADISON UNIT     V-26 BEAVER LODGE-MADISON UNIT V-26 AMERADA PETROLEUM CORP. N. DAKOTA STATE A" #1" 12/11/1951 8595 WILLIAMS 156 N 95 W 16 NESE 1980 FSL  660 FEL BEAVER LODGE MADISON 911269|459128|0 02/11/1952|521|18|641 VERTICAL 48.332938 -102.895553 OG PA 8/3/1984

    Whoever was saving the data in the CSV file made a rookie mistake of doing
     + '"' + strValue + '"'
    instead of 
    + QUOTENAME(strValue, '"') 
    or, if strValue can be possibly longer than 128 characters,
    + '"' + REPLACE(strValue, '"', '""') + '"'

    Put these 2 rows into Excel and save the file as CSV.
    You'll see how it should be done correctly.

    _____________
    Code for TallyGenerator

  • Sergiy - Tuesday, August 28, 2018 8:44 PM

    The file has been built incorrectly.
    No parsing tool can fix it.
    Must be corrected at the source.

    "NORTH DAKOTA STATE A"",""1",
    "NORTH DAKOTA STATE ""A"" #1",
    and so on.

    If the data is inconsistent in it's mistake, then I absolutely agree.  This is a rookie mistake on the part of whomever assembles the data and it's easily fixed at the source.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Is this a one off, or will this be regularly scheduled load?
    Try specifying just the comma delimiter in SSIS, load into a staging table and write a stored procedure to clean it up and load to the target table.

  • Joe Torre - Wednesday, August 29, 2018 11:45 AM

    Is this a one off, or will this be regularly scheduled load?
    Try specifying just the comma delimiter in SSIS, load into a staging table and write a stored procedure to clean it up and load to the target table.

    Better yet, do a BULK INSERT and sequester the bad rows as they occur and fix them separately.  There are "switches" in BULK INSERT that will allow that.  I imagine that SSIS can also do similar.  That would make it so only the "problem" rows would need any form of rework.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thank you all for the help.
    This is how I was able to resolve the issue.
    I did set the TextQualified to False on the columns where the issue is happening and used a Derived Column Transformation to remove the surrounding """" data.

  • kashyap4007 - Wednesday, August 29, 2018 12:58 PM

    Thank you all for the help.
    This is how I was able to resolve the issue.
    I did set the TextQualified to False on the columns where the issue is happening and used a Derived Column Transformation to remove the surrounding """" data.

    The original bad row that you posted had a comma embedded in a "field", did it not?  If so, this doesn't fix that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • There is comma also in few fiends Jeff but it did not cause me any issues.
    Below are some wierd data in some of the columns:
    NORTH DAKOTA STATE A" #1"
    DEVONIAN,MADISON,SILURIAN
    324488|33755|0,180041|70622|0,81960|76|0

  • kashyap4007 - Wednesday, August 29, 2018 1:59 PM

    There is comma also in few fiends Jeff but it did not cause me any issues.
    Below are some wierd data in some of the columns:
    NORTH DAKOTA STATE A" #1"
    DEVONIAN,MADISON,SILURIAN
    324488|33755|0,180041|70622|0,81960|76|0

    I guess I don't understand how those commas didn't cause any issues.  I thought the data was comma delimited.  Considering your recent posts, is the data actually tab delimited?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply