SSIS - importing flat files - how to specify date format

  • I get a lot of "blonde jokes" so I take things a little more personally than most 😉

  • Never fear...

    There is no such thing as a stupid question. EVER.

    Really, when I started on SSIS (some 7 whole weeks ago) it took all of 3 weeks to deliver the first meaningful component, and on some of the tasks in it I spent more than a day googling. Stuff that is now easy to do, but was impossible then to do...

    ~PD

  • AndyD (6/26/2008)


    I couldn't find exactly what I was looking for (VB code for an SSIS project), but you may find that pure T-SQL will accomplish what you need. For example:

    declare @myDate varchar(10)

    declare @myISOdate smalldatetime

    set @myDate = '11-23-2007'

    set @myISOdate = (select substring(@myDate, 7, 4) + '-' + substring(@myDate, 1, 2) + '-' + substring(@myDate, 4, 2))

    select @myISOdate

    Can you use this code snippet in your data import routine?

    Andy

    Thanks Andy, thats looks fantastic... Sadly.. I'm not a VB programmer at all. I'm using the SQL Server business intelligence development studio.

    I know that I can use a TSQL convert within SQL, but don't know to do that in SSIS and add that via the GUI/IDE SQL Server business intelligence development studio.

    However, if you are saying that SSIS is reading your data in the incorrect format (mm/dd/yyyy as opposed to dd/mm/yyyy), perhaps start by checking your regional settings from where the SSIS is running.

    ~PD

    I hadn't considered local settings at all. (do you mean Windows/PC?) Never occurred to me... And to be honest I get squaemish at the thought of PC/Windows setting dictating how my data should be read... and what would happen if you get data from different sources in different formats?

    Someone mentioned bulk copying. Is that like the old Sybase BCP utility? If I could use something like BCP or Oracle SQL-loader and sidestep SSIS I would be a very happy man.

    FYI: What I have configured at the moment works. (fingies-x'ed) But I'd really like to understand how to implement this correctly.

  • pduplessis (6/26/2008)


    SUBSTRING also works in SSIS (*use as a derived column) .

    Hmmm this sounds like in might be getting closer what i'm after.. what about a convert?

    However, if you are saying that SSIS is reading your data in the incorrect format (mm/dd/yyyy as opposed to dd/mm/yyyy), perhaps start by checking your regional settings from where the SSIS is running.

    I have dates in a csv file in mm/dd/yyyy format. But SSIS was reading them as dd/mm/yyy.

    This had the confusing and hilarious effect of interpreting dates inconsistently. Any dates where the day was greater tehan the 12th of the month would be correctly interpreted as dd/mm/yyyy but anything less than that would be backwards.

    So 05/21/2008 would be correctly read as the 21st of May 2008.

    But 01/06/2008 would happily be interpreted as 1st of June 2008.

  • You have a couple of choices. For some people, the easiest way to import data is to do a BCP (bulk copy) into a temporary table, and then copy the data from the temporary table to the permanent table. Rules and transformations can be applied to the data during the copy. I am not a big fan of creating a 2-stage process where 1-stage should suffice, but it definitely has its advantages. If you don't want to get involved with SSIS, this method should suit you.

    If you want to try this, then using SSMS, select the database you want to import the data to, right-click and choose TASKS | IMPORT DATA. The wizard that starts will create a BCP for you (which you can save as an SSIS package).

    Once you have bulk-copied the data into a temporary table, put your T-SQL writing hat on 🙂

  • Hi All,

    I'm relatively new to SSIS. I am trying to import a csv file into my DB using SSIS. The problem I am facing is that I have unicode characters in my source file and the SSIS package that I create is not able to read the delimiters when I specify the input to be unicode (the little checkbox when creating the datasource). If i do not specify unicode, then I am able to preview the data that I have in the csv file. But when I specify unicode and try to preview, it says

    TITLE: Microsoft Visual Studio

    ------------------------------

    The specified header or data row delimiter "Comma {,}" is not found after scanning 65536 bytes of the file "C:\Documents and Settings\kkql822\Desktop\PD_PDinstanceUS.txt".

    Do you want to continue scanning this file?

    This question might have been answered previously but I have been unable to solve this problem (after several google searches).

  • AndyD (6/27/2008)


    You have a couple of choices. For some people, the easiest way to import data is to do a BCP (bulk copy) into a temporary table, and then copy the data from the temporary table to the permanent table. Rules and transformations can be applied to the data during the copy. I am not a big fan of creating a 2-stage process where 1-stage should suffice, but it definitely has its advantages. If you don't want to get involved with SSIS, this method should suit you.

    If you want to try this, then using SSMS, select the database you want to import the data to, right-click and choose TASKS | IMPORT DATA. The wizard that starts will create a BCP for you (which you can save as an SSIS package).

    Once you have bulk-copied the data into a temporary table, put your T-SQL writing hat on 🙂

    Hey that sounds great. Thank you. I think it might be the most suitable approach for me. Provided that I can process multiple files easily, I'll be a very happy bunny.

    Thanks again one and all for your suggestions.

    Regards

    Peter

  • SSIS is not a SQL Tool but a .Net tool therefore how it works is more akin to C# or VB

  • Since the Microsoft web site acknowledges that this can be a problem I use the derived column task to change the format to yyyy-mm-dd. This is the only format that works consistently, for me anyway.

    Formula:RIGHT(date_of_birth,4) + "-" + SUBSTRING(date_of_birth,FINDSTRING(date_of_birth,"/",1) + 1,LEN(date_of_birth) - FINDSTRING(date_of_birth,"/",2) - 2) + "-" + SUBSTRING(date_of_birth,1,FINDSTRING(date_of_birth,"/",1) - 1)

    Assuming the date field to be date_of_birth and the format of the string is dd/mm/yyyy

    You can then cast the result as a dt_date type, you have to look for blank columns as well as the derived column formula spits the dummy on empty columns

  • Thanks for the formula.. it's exactly what i needed!!

  • andymay-496937 (11/24/2010)


    Since the Microsoft web site acknowledges that this can be a problem I use the derived column task to change the format to yyyy-mm-dd. This is the only format that works consistently, for me anyway.

    Formula:RIGHT(date_of_birth,4) + "-" + SUBSTRING(date_of_birth,FINDSTRING(date_of_birth,"/",1) + 1,LEN(date_of_birth) - FINDSTRING(date_of_birth,"/",2) - 2) + "-" + SUBSTRING(date_of_birth,1,FINDSTRING(date_of_birth,"/",1) - 1)

    Assuming the date field to be date_of_birth and the format of the string is dd/mm/yyyy

    You can then cast the result as a dt_date type, you have to look for blank columns as well as the derived column formula spits the dummy on empty columns

    Sorry. Where do i have to insert the mentioned code? I tried it in many places and it doesn't work.

    Thanks.

  • Sorry to resurrect but thought the solution to this may interest any future googlers. Andy pretty much had it with SSIS being based on .net, so the way to set date format in .net is to change the regional culture of the thread. If you put this line of code into a script task in the control flow, it should solve your problem.

    System.Threading.Thread.CurrentThread.CurrentCulture = New System.Globalization.CultureInfo("en-US", True)

    This works for 2005 and 2008 and you can find a list of culture names here:http://www.csharp-examples.net/culture-names/

    Hope this helps

    Laurence

  • Hi

    My problem was exactly the same as and exactly opposite from the original poster's - date in text file is dd/mm/yyyy and imports to mm-dd-yyyy when dd is <=12, while I want it to be yyyy-mm-dd.

    Tried the culture change in code, but either i was doing it wrong (likely), or it doesn't work, or I picked the wrong one (en-ZA).

    In the end I just imported the date as text (char(10)) and manipulated the string to rearrange the date components.

    I'm sure there is at least one more elegant solution out there, but sometimes you just have to find an answer.

    Lidia

  • Hi,

    i have one idea,

    1) if source is csv file then date datatype is dt_wstr

    2) in dervied transformation take substring(date,3,4)+"_"+substring(date,1,2)+"_"+substring(date,6,9)

    format will be to dd-mm-yyyy

    please check it.

    if working fine it's good,otherwise sorry

  • Lidia Joubert (2/29/2012)


    Hi

    Tried the culture change in code, but either i was doing it wrong (likely), or it doesn't work, or I picked the wrong one (en-ZA).

    Lidia

    Alas, I found out later that using a script task in the control flow produced unpredictable results. It seemed to work for smaller packages, but once the thread count increased, it seemed to produce questionable results. I'm fairly sure that if you can set this on the base thread for the parent on package load, the subsequent threads will inherit it's culture accordingly. Unfortunately, real life and deadlines get in the way of hacking around in SSIS so I never found a stable way to make this work in a global way so apologies for not updating the thread.

    You can however use this method in a script component as a transformation in the dataflow and define the culture directly, like so:

    using System;

    using System.Data;

    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

    using Microsoft.SqlServer.Dts.Runtime.Wrapper;

    using System.Globalization;

    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]

    public class ScriptMain : UserComponent

    {

    CultureInfo DateFormat = new CultureInfo("en-US", true);

    public override void Input0_ProcessInputRow(Input0Buffer Row)

    {

    Row.Date = DateTime.Parse(Row.DateString, DateFormat);

    }

    }

    Of course, another way around this would be to use an OLEDB Command with a convert statement in T-SQL. I think I prefer the above method though.

    Hope this helps

    L

Viewing 15 posts - 16 through 30 (of 34 total)

You must be logged in to reply to this topic. Login to reply