Delimited String Parsing Functions - Big2D set

by Jesse Roberge - YeshuaAgapao@gmail.com
Update: Added robustness for NULL inputs and made it return no rows on blank inputs.

Feed it large strings of double-delimited horizontal data and it returns the data as a non-pivoted vertical table with a 2-dimensional star schema.
The Big2D function set supports delimited strings longer than 8000 characters, but the individual elements must be 8000 characters or less.
If you want performance and don't need to process delimited strings over 8000 characters, use the 2D function set instead of the Big2D function set.
Requires a table of numbers. These functions expect it to be called 'Counter', to have an integer column named PK_CountID, and to live in the same database that you save these functions to.
Search for 'Counter table (table of numbers) setter-upper for SQL Server 2005' or 'Counter table (table of numbers) setter-upper for SQL Server 2000' if you need a script to set this up for you.
Requires SQL Server 2005 or later (the functions use CROSS APPLY).
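
If you would rather not hunt for the setter-upper script, the following is a minimal sketch of a compatible numbers table. The table name dbo.Counter and the column name PK_CountID come from the functions themselves; the row count of 10000 and the use of sys.all_columns as a row source are assumptions made for this illustration, not part of the original script.

```sql
-- Sketch only: builds dbo.Counter if it does not already exist.
-- Each slice the functions scan is at most 8000 characters long,
-- so the table needs at least the values 1..8000.
IF OBJECT_ID('dbo.Counter') IS NULL
BEGIN
	CREATE TABLE dbo.Counter
	(
		PK_CountID int NOT NULL PRIMARY KEY CLUSTERED
	);

	-- Number the rows of a cross join of a system view, then keep 1..10000.
	INSERT INTO dbo.Counter (PK_CountID)
	SELECT Numbers.n
	FROM
		(
			SELECT ROW_NUMBER() OVER (ORDER BY a.object_id) AS n
			FROM sys.all_columns AS a
			CROSS JOIN sys.all_columns AS b
		) AS Numbers
	WHERE Numbers.n <= 10000;
END
GO
```

The functions filter on PK_CountID > 0, so a table that also contains 0 should work equally well.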

Variants:
Array: Has array position index (RowPos, ColPos); value data is not cast.
Table: No array position index; value data is not cast.
IntArray: Has array position index (RowPos, ColPos); value data is cast to int.
IntTable: No array position index; value data is cast to int.
In the Big2D delimiter function set, the table variants have some performance gain over the array variants, but are not very useful except in joins.
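
As an illustration of the join use case, here is a small sketch that filters a table by the integers found in a delimited string. The table variable @Product and its columns are hypothetical and exist only for this example; PK_IntID is the output column of the IntTable variant.

```sql
-- Hypothetical lookup table, declared inline so the example is self-contained.
DECLARE @Product TABLE
(
	ProductID int NOT NULL PRIMARY KEY,
	ProductName varchar(50) NOT NULL
);
INSERT INTO @Product (ProductID, ProductName) VALUES (11, 'Widget');
INSERT INTO @Product (ProductID, ProductName) VALUES (222, 'Gadget');
INSERT INTO @Product (ProductID, ProductName) VALUES (999, 'Gizmo');

-- The IntTable variant returns every element (11, 111, 22, 222) as one int column,
-- so the join keeps only the products whose ID appears somewhere in the string.
SELECT P.ProductID, P.ProductName
FROM @Product AS P
JOIN dbo.fn_DelimitToIntTable_Big2D('11^111,22^222', ',', '^') AS Delimit
	ON Delimit.PK_IntID = P.ProductID;
```

With the sample rows above, the join should return the products with IDs 11 and 222. Because the table variants carry no RowPos or ColPos, there is no way to tell which row or column of the original string an element came from, which is why they suit membership tests like this one.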

Usage:
SELECT * FROM fn_DelimitToArray_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit
SELECT * FROM fn_DelimitToIntArray_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
SELECT * FROM fn_DelimitToIntTable_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
SELECT * FROM fn_DelimitToTable_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit
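
The array variants return one output row per element, tagged with RowPos and ColPos. A sketch of turning that vertical result back into one row per record is shown below; the column aliases Colour, Shape and Label are made up for this example, and the conditional-aggregate pivot is just one common way to do it.

```sql
-- Each (RowPos, ColPos, Value) triple is one element of the input string.
-- Grouping by RowPos and picking the value for each ColPos rebuilds the rows.
SELECT
	Delimit.RowPos,
	MAX(CASE WHEN Delimit.ColPos = 1 THEN Delimit.Value END) AS Colour,
	MAX(CASE WHEN Delimit.ColPos = 2 THEN Delimit.Value END) AS Shape,
	MAX(CASE WHEN Delimit.ColPos = 3 THEN Delimit.Value END) AS Label
FROM dbo.fn_DelimitToArray_Big2D('red^square^square1,green^rectangle^rectangle1', ',', '^') AS Delimit
GROUP BY Delimit.RowPos
ORDER BY Delimit.RowPos;
```

For this input the query should produce two rows: (1, red, square, square1) and (2, green, rectangle, rectangle1).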

Copyright:
Licensed under the L-GPL - a weak copyleft license - you are permitted to use this as a component of a proprietary database and call this from proprietary software.
Copyleft lets you do anything you want except plagiarize, conceal the source, or prohibit copying & re-distribution of this script/proc.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.

see <http://www.fsf.org/licensing/licenses/lgpl.html> for the license text.

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

/*
Delimited String Parsing Functions - Big2D set
by Jesse Roberge - YeshuaAgapao@gmail.com
Update: Added robustness for NULL inputs and made it return no rows on blank inputs.

Feed it large strings of double-delimited horizontal data and it returns the data as a non-pivoted vertical table with a 2-dimensional star schema.
The Big2D function set supports delimited strings longer than 8000 characters, but the individual elements must be 8000 characters or less.
If you want performance and don't need to process delimited strings over 8000 characters, use the 2D function set instead of the Big2D function set.
Requires a table of numbers.  These functions expect it to be called 'Counter', to have an integer column named PK_CountID, and to live in the same database that you save these functions to.
Search for 'Counter table (table of numbers) setter-upper for SQL Server 2005' or 'Counter table (table of numbers) setter-upper for SQL Server 2000' if you need a script to set this up for you.
Requires SQL Server 2005 or later (the functions use CROSS APPLY).

Variants:
	Array		Has array position index (RowPos, ColPos); value data is not cast.
	Table		No array position index; value data is not cast.
	IntArray	Has array position index (RowPos, ColPos); value data is cast to int.
	IntTable	No array position index; value data is cast to int.
In the Big2D delimiter function set, the table variants have some performance gain over the array variants, but are not very useful except in joins.

Usage:
	SELECT * FROM fn_DelimitToArray_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit
	SELECT * FROM fn_DelimitToIntArray_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
	SELECT * FROM fn_DelimitToIntTable_Big2D ('11^111^1111,22^222^2222,33^333^3333,44^444^4444,55^555^5555',',','^') AS Delimit
	SELECT * FROM fn_DelimitToTable_Big2D ('red^square^square1^square2,green^rectangle^rectangle1^rectangle2,yellow^circle^circle1^circle2,blue^triangle^triangle1^triangle2,orange^oval^oval1^oval2,purple^hexagon^hexagon1^hexagon2',',','^') AS Delimit

Copyright:
	Licensed under the L-GPL - a weak copyleft license - you are permitted to use this as a component of a proprietary database and call this from proprietary software.
	Copyleft lets you do anything you want except plagiarize, conceal the source, or prohibit copying & re-distribution of this script/proc.

	This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Lesser General Public License as
    published by the Free Software Foundation, either version 3 of the
    License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Lesser General Public License for more details.

    see <http://www.fsf.org/licensing/licenses/lgpl.html> for the license text.
*/

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToArray_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToArray_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToArray_Big2D
	(
		@String text,
		@Delimiter VarChar(1),
		@Delimiter2 VarChar(1)
	)
RETURNS @T TABLE
	(
		RowPos int NOT NULL,
		ColPos int NOT NULL,
		Value VarChar(8000) NOT NULL
	)
AS

BEGIN

	DECLARE @Slices Table
	(
		Slice VarChar(8000) NOT NULL,
		CumulativeElementCount int NOT NULL
	)

	DECLARE @Slice VarChar(8000)
	DECLARE @TextPos int
	DECLARE @MaxLength int
	DECLARE @StopPos int
	DECLARE @StringLength int
	DECLARE @CumulativeElementCount int
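	-- Slices are stored as VarChar(8000); reserve 2 characters for the delimiters wrapped around each slice.
	SELECT @TextPos = 1, @MaxLength = 8000 - 2, @CumulativeElementCount=0
	-- Offset by @MaxLength so the loop stops once the remainder fits in a single slice. A NULL input counts as length 0.
	SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

	WHILE @TextPos < @StringLength
	BEGIN
		SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
		-- Cut the slice just before its last row delimiter so that no row is split across two slices.
		SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

		INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter, @CumulativeElementCount)

		-- Track how many rows have been consumed so RowPos keeps counting across slices.
		SELECT @CumulativeElementCount=@CumulativeElementCount+LEN(@Slice)-LEN(REPLACE(@Slice, @Delimiter, ''))
		-- Resume just past the delimiter that ended this slice.
		SELECT @TextPos = @TextPos + @StopPos + 1
	END
	-- Last (or only) slice. Skipped when the input is NULL or empty, so no rows are returned.
	IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter, @CumulativeElementCount);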

	INSERT INTO @T (RowPos, ColPos, Value)
	SELECT Counter1st.Pos AS RowPos, Counter2nd.Pos AS ColPos, Counter2nd.Value
	FROM
		(
			SELECT
				PK_CountID - LEN(REPLACE(LEFT(Slices.Slice, PK_CountID-1), @Delimiter, '')) + Slices.CumulativeElementCount AS Pos,
				SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
			FROM
				dbo.Counter WITH (NOLOCK)
				JOIN @Slices AS Slices ON
					Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
					SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
		) AS Counter1st
		CROSS APPLY (
			SELECT
				PK_CountID - LEN(REPLACE(LEFT(Counter1st.Value, PK_CountID-1), @Delimiter2, '')) AS Pos,
				SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
			FROM dbo.Counter WITH (NOLOCK)
			WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SubString(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
		) AS Counter2nd
	RETURN

END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToIntArray_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToIntArray_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToIntArray_Big2D
	(
		@String text,
		@Delimiter VarChar(1),
		@Delimiter2 VarChar(1)
	)
RETURNS @T TABLE
	(
		RowPos int NOT NULL,
		ColPos int NOT NULL,
		PK_IntID int NOT NULL
	)
AS

BEGIN

	DECLARE @Slices Table
	(
		Slice VarChar(8000) NOT NULL,
		CumulativeElementCount int NOT NULL
	)

	DECLARE @Slice VarChar(8000)
	DECLARE @TextPos int
	DECLARE @MaxLength int
	DECLARE @StopPos int
	DECLARE @StringLength int
	DECLARE @CumulativeElementCount int
	SELECT @TextPos = 1, @MaxLength = 8000 - 2, @CumulativeElementCount=0
	SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

	WHILE @TextPos < @StringLength
	BEGIN
		SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
		SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

		INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter, @CumulativeElementCount)

		SELECT @CumulativeElementCount=@CumulativeElementCount+LEN(@Slice)-LEN(REPLACE(@Slice, @Delimiter, ''))
		SELECT @TextPos = @TextPos + @StopPos + 1
	END
	IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice, CumulativeElementCount) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter, @CumulativeElementCount);

	INSERT INTO @T (RowPos, ColPos, PK_IntID)
	SELECT Counter1st.Pos AS RowPos, Counter2nd.Pos AS ColPos, CONVERT(int, Counter2nd.Value) AS PK_IntID
	FROM
		(
			SELECT
				PK_CountID - LEN(REPLACE(LEFT(Slices.Slice, PK_CountID-1), @Delimiter, '')) + Slices.CumulativeElementCount AS Pos,
				SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
			FROM
				dbo.Counter WITH (NOLOCK)
				JOIN @Slices AS Slices ON
					Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
					SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
		) AS Counter1st
		CROSS APPLY (
			SELECT
				PK_CountID - LEN(REPLACE(LEFT(Counter1st.Value, PK_CountID-1), @Delimiter2, '')) AS Pos,
				SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
			FROM dbo.Counter WITH (NOLOCK)
			WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SubString(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
		) AS Counter2nd
	RETURN
END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToIntTable_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToIntTable_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToIntTable_Big2D
	(
		@String text,
		@Delimiter VarChar(1),
		@Delimiter2 VarChar(1)
	)
RETURNS @T TABLE
	(
		PK_IntID int NOT NULL
	)
AS

BEGIN

	DECLARE @Slices Table
	(
		Slice VarChar(8000) NOT NULL
	)

	DECLARE @Slice VarChar(8000)
	DECLARE @TextPos int
	DECLARE @MaxLength int
	DECLARE @StopPos int
	DECLARE @StringLength int
	SELECT @TextPos = 1, @MaxLength = 8000 - 2
	SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

	WHILE @TextPos < @StringLength
	BEGIN
		SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
		SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

		INSERT INTO @Slices (Slice) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter)

		SELECT @TextPos = @TextPos + @StopPos + 1
	END
	IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter);

	INSERT INTO @T (PK_IntID)
	SELECT CONVERT(int, Counter2nd.Value) AS PK_IntID
	FROM
		(
			SELECT
				SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
			FROM
				dbo.Counter WITH (NOLOCK)
				JOIN @Slices AS Slices ON
					Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
					SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
		) AS Counter1st
		CROSS APPLY (
			SELECT
				SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
			FROM dbo.Counter WITH (NOLOCK)
			WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SubString(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
		) AS Counter2nd
	RETURN
END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

IF OBJECT_ID('dbo.fn_DelimitToTable_Big2D') IS NOT NULL DROP FUNCTION dbo.fn_DelimitToTable_Big2D
GO

CREATE FUNCTION dbo.fn_DelimitToTable_Big2D
	(
		@String text,
		@Delimiter VarChar(1),
		@Delimiter2 VarChar(1)
	)
RETURNS @T TABLE
	(
		Value VarChar(8000) NOT NULL
	)
AS

BEGIN

	DECLARE @Slices Table
	(
		Slice VarChar(8000) NOT NULL
	)

	DECLARE @Slice VarChar(8000)
	DECLARE @TextPos int
	DECLARE @MaxLength int
	DECLARE @StopPos int
	DECLARE @StringLength int
	SELECT @TextPos = 1, @MaxLength = 8000 - 2
	SELECT @StringLength=ISNULL(DATALENGTH(@String),0)-@MaxLength

	WHILE @TextPos < @StringLength
	BEGIN
		SELECT @Slice = SUBSTRING(@String, @TextPos, @MaxLength)
		SELECT @StopPos = @MaxLength - CHARINDEX(@Delimiter, REVERSE(@Slice))

		INSERT INTO @Slices (Slice) VALUES (@Delimiter + LEFT(@Slice, @StopPos) + @Delimiter)

		SELECT @TextPos = @TextPos + @StopPos + 1
	END
	IF @StringLength>0-@MaxLength INSERT INTO @Slices (Slice) VALUES (@Delimiter + SUBSTRING(@String, @TextPos, @MaxLength) + @Delimiter);

	INSERT INTO @T (Value)
	SELECT Counter2nd.Value
	FROM
		(
			SELECT
				SUBSTRING(Slices.Slice, Counter.PK_CountID + 1, CHARINDEX(@Delimiter, Slices.Slice, Counter.PK_CountID + 1) - Counter.PK_CountID - 1) AS Value
			FROM
				dbo.Counter WITH (NOLOCK)
				JOIN @Slices AS Slices ON
					Counter.PK_CountID>0 AND Counter.PK_CountID <= LEN(Slices.Slice) - 1 AND
					SUBSTRING(Slices.Slice, Counter.PK_CountID, 1) = @Delimiter
		) AS Counter1st
		CROSS APPLY (
			SELECT
				SUBSTRING(Counter1st.Value+@Delimiter2, PK_CountID, CHARINDEX(@Delimiter2, Counter1st.Value+@Delimiter2, PK_CountID)-PK_CountID) AS Value
			FROM dbo.Counter WITH (NOLOCK)
			WHERE PK_CountID >0 AND PK_CountID<LEN(Counter1st.Value)+LEN(@Delimiter2) AND SubString(@Delimiter2 + Counter1st.Value + @Delimiter2, PK_CountID, 1)=@Delimiter2
		) AS Counter2nd
	RETURN
END
GO

--*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
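
Once the functions are installed, a quick smoke test of the over-8000-character path might look like the sketch below. The variable name @Big and the generated test data are assumptions made for this example; it also assumes dbo.Counter holds at least the values 1 through 8000.

```sql
-- Build 2000 rows of '11^111^1111' separated by commas (23999 characters),
-- which forces the functions to split the input into several slices.
DECLARE @Big varchar(max);
SET @Big = '11^111^1111' + REPLICATE(CAST(',11^111^1111' AS varchar(max)), 1999);

-- RowPos should keep counting across slice boundaries.
SELECT
	COUNT(*) AS ElementCount,
	MAX(Delimit.RowPos) AS LastRowPos,
	MAX(Delimit.ColPos) AS LastColPos
FROM dbo.fn_DelimitToIntArray_Big2D(@Big, ',', '^') AS Delimit;
```

If the slicing works as intended, this should report 6000 elements, a last RowPos of 2000 and a last ColPos of 3. Passing NULL or an empty string instead of @Big should return no rows.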
