Data File Space Cleansing

  • I have a data file that contains an irregular number of spaces directly following the text qualifiers of the last column just before the new line starts.

    "xxx"|"xxx" [][][][][][][][]

    "xxx"|"xxx" [][][][][][][][][][][]

    The file is too big to manually replace the blank spaces and I cannot use the trim function since it needs to be applied on a certain field rather than void blank spaces... Wondering if there is any feature or in SSIS or function i could use in the VB script components to address the issue effectively.

    Regards,

    Samer

  • s.mousharrafieh (11/8/2010)


    The file is too big to manually replace the blank spaces and I cannot use the trim function since it needs to be applied on a certain field rather than void blank spaces...

    Can you explain what you mean by this a little bit more?

    Can you extract the data based on the text qualifier and then trim the columns separately in a derived column transformation component?

  • The file has more than 200,000 records, deleting the blank spaces from every single line manually through notepad/wordpad.... would take hours. As for reading the values from the column & replacing the unnecessary spaces in a derived column doesn't quite work due to the case presented below.

    The file format looks like the following:

    column1 column2 column3

    "value1"|"value2"|"value3" [][][][][][][]

    "value1"|"value2"|"value3" [][][][][][][][][][]

    "value1"|"value2"|"value3" [][][][][]

    [] represents a space character

    Considering the above, the blank spaces at the end of each record will cause the values of the columns to get messed up, thats why i need to treat the spaces before i can even try to set a flat file connection manager based on that file.

    Note that SSIS doesn't treat the blank spaces as a 4th column due to absence of the "|" delimiter and doesn't consider it part of the 3rd column since it is not enclosed within the same " ".

    suggestions?

    Regards,

    Sa,er

  • The file has more than 200,000 records, deleting the blank spaces from every single line manually through notepad/wordpad.... would take hours. As for reading the values from the column & replacing the unnecessary spaces in a derived column doesn't quite work due to the case presented below.

    The file format looks like the following:

    column1 column2 column3

    "value1"|"value2"|"value3" [][][][][][][]

    "value1"|"value2"|"value3" [][][][][][][][][][]

    "value1"|"value2"|"value3" [][][][][]

    [] represents a space character

    Considering the above, the blank spaces at the end of each record will cause the values of the columns to get messed up, thats why i need to treat the spaces before i can even try to set a flat file connection manager based on that file.

    Note that SSIS doesn't treat the blank spaces as a 4th column due to absence of the "|" delimiter and doesn't consider it part of the 3rd column since it is not enclosed within the same " ".

    suggestions?

    Regards,

    Samer

  • Samer: Sorry for all of the questions!

    What exactly is happening in the current state if you open the file? And what do you WANT to happen? What behavior are you expecting to see? Thanks.

    I'm not quite sure how you want to treat the spaces? Included in Column 3? Placed in their own column? Completely excluded?

    Any way that you could provide a small sample with dummy data? Thanks.

  • I need to get rid of the blank spaces at the end of each row before I can put the file in a connection adapter.

    as demonstrated in the image the spaces have been handled in the first row, so the values of the last column is read correctly the rest of the rows on the other hand will cause the package to fail when executed.

  • Have you tried changing your row delimiter?

  • Changing the row delimiter has nothing to do with how the file adapter detects the blank spaces at the end. Moreover I cannot use a flat file connection manager before I get rid of these spaces. A senior developer helped me with this issue where u get to read the file as is in VB, search for the last text qualifier: " on each row and delete all characters following that index on the current row. Problem solved!

    Regards,

    Samer.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply