string of data needs to be parsed to seperate columns

  • i have a string of data that looks like this, there is no set position for where the next value will be retrieved

    one row can look like this

    24W 529 22 132 110 123 -13 516

    another row can look like this

    3102 DENIM 8 99 0 213 213 0 213 312

    what i need to accomplish is be able to retrieve the next value regardless of specific position

    i was using SUBSTRING(ap_vchr_tbl.ap_remt_key, 1, LEN(ap_vchr_tbl.ap_remt_key) - 2)

    but wasnt giving me what i actually need.

  • Here is a fairly common string split function that I use often:

    CREATE FUNCTION strSplit (@arr AS NVARCHAR(MAX), @sep AS NCHAR(1))

    RETURNS TABLE

    AS

    RETURN

    WITH

    L0 AS (SELECT 1 AS C UNION ALL SELECT 1) --2 rows

    ,L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B) --4 rows (2x2)

    ,L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B) --16 rows (4x4)

    ,L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B) --256 rows (16x16)

    ,L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B) --65536 rows (256x256)

    ,L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B) --4,294,967,296 rows (65536x65536)

    ,Numbers AS (SELECT row_number() OVER (ORDER BY (SELECT 0)) AS N FROM L5)

    SELECT

    (n - 1) - LEN(REPLACE(LEFT(@arr, n-1), @sep, N'')) + 1 AS pos,

    LTRIM(SUBSTRING(@arr, n, CHARINDEX(@sep, @arr + @sep, n) - n)) AS element

    FROM Numbers

    WHERE n <= LEN(@arr) + 1

    AND SUBSTRING(@sep + @arr, n, 1) = @sep

    AND Numbers.n <= 1000

    You would call it like so:

    SELECT * FROM dbo.strSplit(@variable, ' ') -- space is your seperator

    Or as a set-based operation:

    SELECT [Table].[IDField], element

    FROM [Table]

    CROSS APPLY dbo.strSplit([Table].[ArrayField], ' ')

    ORDER BY pos

    Which would return one row per element in the array field, times each row in the source table.

  • I'm guessing your values are separated by a space?

    If so, search this site for a split string function (e.g. search for "DelimitedSplit8K").

    This function is the fastest (T-SQL) way I know of when splitting a delimited list.

    There is a CLR solution around that seems to be even faster.

    You might want to have a look at Flo's blog for a more detailed comparison/explanation. (@Paul, if you come across this thread: did I provide a correct answer this time? :-D)



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • thanks that helps but what i need to finally accomplish is have each value added to a seperate column in a new table.

  • How do you intend to do that unless there is a fixed number of columns? If you know how many columns there will be, you can PIVOT the data. Otherwise, the problem because a great deal more complicated.

  • If you know the number of values you could use the CrossTab method as an alternative to PIVOT.

    The advantage: Once you know the basics behind CrossTab you could explore DynamicCrossTab to get a unknown number of (distinct) values.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Either way you slice it, it's all going to start with that split function. It's simply one of the best ways to break apart a string like that. Once you have your data in rows, you can then look into transposing them.

    I'd be happy to help you figure out how to get the data into columns, but I need more information about your requirements. Based on the two peices of data you posted earlier, what would the end result look like?

  • This is part 2 of a set of articles written by Jeff Moden. It covers dynamic cross tabs: http://qa.sqlservercentral.com/articles/Crosstab/65048/.

    Check it out. What ever you do, it will start with the delimited split function.

  • i have files that get sent to us i believe in a flat file(not sure yet). someone them puts them in excel. then creates a table from them.(not sure of the script they use for this) then data in the table ends up being in one colum as follows;

    20W 0 27 412 385 83 302 302

    or

    9308 NAVY 6P 0 7 436 429 40 389 389

    i was given the task of taking the original file(which hasnt been received yet) and create sql tables that can be used for reporting purposes, and others.

    my thoughts were to take the row and use the substring to pulll the data by position and create a table from that. i was hoping i could use the instr function to search for the values between the blank space, retrieve the value and create a table. sounds simpliar than it is because it isnt standard spacing. im not a newbie but always learning. thanks for all your help. 😀

  • You can use INSTR, but it will be fairly efficient. It definitely works, but you really need to call it continuously until the entire string is parsed. This means you need either a WHILE LOOP/CURSOR, or a table-valued function. You don't want to use a loop because then you're operating on each row one at a time. You want a inline table-valued function, well known as the most efficient of all the SQL Server programability tools. Instead of using INSTR, the function I provided breaks down the string into individual characters and performs set-based operations on that table of characters. This allows the entire "loop" to be contained in a single query, which can be expanded inline. I know that it looks a little cryptic, but it works (well) and it's wicked fast. Aside from using a CLR split function, which adds another layer of complexity, it's one of the fastest string split functions you'll find. You don't need to know how it works exactly, because this function has been vetted by thousands of programmers. It's a well known, well published solution to exactly the problem you are looking at.

    I'm still not sure I have all the information, but it looks like you have two different types of strings, and I'm guess they end up in two different tables? If that's the case, you want to break them up before you even start. Probably the most consistant way would be to break up the string and count the number of elements:

    SELECT *

    FROM [Table] T

    LEFT JOIN (

    SELECT [Table].[IDField], COUNT(*) AS ItemCount

    FROM [Table] CROSS APPLY dbo.strSplit([Table].[ArrayField])

    GROUP BY [Table].[IDField]

    ) IC ON T.[IDField] = IC.[IDField]

    WHERE IC.ItemCount = <number of fields>

    That would break the table down into only rows with an array with the number of elements you are looking for. One you have that, you want to split the array again and then cross tab it:

    SELECT

    [Table].[IDField],

    MAX(CASE WHEN pos = 1 THEN element ELSE NULL END) AS Field1,

    MAX(CASE WHEN pos = 2 THEN element ELSE NULL END) AS Field2,

    ...

    FROM

    [Table]

    CROSS APPLY dbo.strSplit([Table].[ArrayField])) A

    GROUP BY A.[IDField]

    To put those together, replace [Table] in the inner query above with the entire first query, or use a CTE. The CTE is probably a little more readible:

    ;WITH cte AS (

    SELECT *

    FROM [Table] T

    LEFT JOIN (

    SELECT [Table].[IDField], COUNT(*) AS ItemCount

    FROM [Table] CROSS APPLY dbo.strSplit([Table].[ArrayField])

    GROUP BY [Table].[IDField]

    ) IC ON T.[IDField] = IC.[IDField]

    WHERE IC.ItemCount = <number of fields>

    )

    INSERT INTO [NewTable]

    SELECT

    [Table].[IDField],

    MAX(CASE WHEN pos = 1 THEN element ELSE NULL END) AS Field1,

    MAX(CASE WHEN pos = 2 THEN element ELSE NULL END) AS Field2,

    ...

    FROM

    [Table]

    CROSS APPLY dbo.strSplit([Table].[ArrayField])) A

    GROUP BY A.[IDField]

    If you had two different types of arrays, you would create two of these, replacing <number of fields> with the number of fields you expect, and Field1,Field2,... with the proper table schema. Of course [NewTable] also has to be the actual table you want to insert data into.

    Even if I misunderstood your needs, this should give you some pretty good ideas about what you can do.

    --J

  • sharonmtowler (7/22/2010)


    i have files that get sent to us i believe in a flat file(not sure yet). someone them puts them in excel. then creates a table from them.(not sure of the script they use for this) then data in the table ends up being in one colum as follows;

    20W 0 27 412 385 83 302 302

    or

    9308 NAVY 6P 0 7 436 429 40 389 389

    i was given the task of taking the original file(which hasnt been received yet) and create sql tables that can be used for reporting purposes, and others.

    my thoughts were to take the row and use the substring to pulll the data by position and create a table from that. i was hoping i could use the instr function to search for the values between the blank space, retrieve the value and create a table. sounds simpliar than it is because it isnt standard spacing. im not a newbie but always learning. thanks for all your help. 😀

    Did you even care to read the posts prior to your reply from above???

    Kinda strange to repeat your question at this point... :ermm:



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (7/22/2010)


    sharonmtowler (7/22/2010)


    i have files that get sent to us i believe in a flat file(not sure yet). someone them puts them in excel. then creates a table from them.(not sure of the script they use for this) then data in the table ends up being in one colum as follows;

    20W 0 27 412 385 83 302 302

    or

    9308 NAVY 6P 0 7 436 429 40 389 389

    i was given the task of taking the original file(which hasnt been received yet) and create sql tables that can be used for reporting purposes, and others.

    my thoughts were to take the row and use the substring to pulll the data by position and create a table from that. i was hoping i could use the instr function to search for the values between the blank space, retrieve the value and create a table. sounds simpliar than it is because it isnt standard spacing. im not a newbie but always learning. thanks for all your help. 😀

    Did you even care to read the posts prior to your reply from above???

    Kinda strange to repeat your question at this point... :ermm:

    I'm guessing here, but no. Looks like the OP just wants an answer without understanding how it works. :ermm:

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply