January 16, 2013 at 11:42 am
aochss (1/16/2013)
I would not expect a record back if the string being sent in is NULL, a single space or all spaces. In my case, I have records where the string being split is often blank and I would want no records coming back from the CROSS APPLY. I could filter them out using in the WHERE clause. I fixed this by adding LTRIM and RTRIM to the source string in the function.Also, some of my "not so good" data also had multiple spaces in between the values causing numerous records returned that were blank.
For example:
dbo.DelimitedSplit8k('123 467 789', ' ') -- There are two spaces between 467 and 789
ItemNumber Item
1 123
2 467
3 <-- Empty String
4 789
I modified the function to give an option for removing extra spaces after the delimiter (like '123, 345, 567) and then as you did to your version to check for null/blank input as well as null/blank rows in the output. To do that I had to change the function into a multi-statement tvf.
Due to the additional checking for nulls/blanks and removing blank rows from the returned table I figured efficiency would be seriously compromised. So I tested this revision against the original and it seriously under performs. In the results below the first run of DelimitedSplit8k_NULLS sets the parameter "@bRemoveSpacesAfterDelimiter" to 0 and the 2nd run set it to 1.
I can see where nulls/blanks could be an issue, but I think it would be better to filter out such rows AFTER applying any CROSS APPLYs using the original function as opposed to trying to do the filtering within the function (though I didn't specifically test that). Maybe someone else can come up with a better way of getting rid of nulls/blanks without compromising the efficiency of DelimitedSplit8K?
Test Results:
RowNum SplitterName NumberOfRows NumberOfElements MinElementLength MaxElementLength Duration MinLength AvgLength MaxLength
1 DelimitedSplit8K 1000 1 1 10 0.01300 1 5 10
2 DelimitedSplit8K_NULLS 1000 1 1 10 0.29300 1 5 10
3 DelimitedSplit8K_NULLS 1000 1 1 10 0.30000 1 5 10
4 DelimitedSplit8K 1000 2 1 10 0.01600 3 11 21
5 DelimitedSplit8K_NULLS 1000 2 1 10 0.36000 3 11 21
6 DelimitedSplit8K_NULLS 1000 2 1 10 0.24000 3 11 21
7 DelimitedSplit8K 1000 4 1 10 0.03300 9 25 40
8 DelimitedSplit8K_NULLS 1000 4 1 10 0.40300 9 25 40
9 DelimitedSplit8K_NULLS 1000 4 1 10 0.31000 9 25 40
10 DelimitedSplit8K 1000 8 1 10 0.05600 29 50 74
11 DelimitedSplit8K_NULLS 1000 8 1 10 0.51300 29 50 74
12 DelimitedSplit8K_NULLS 1000 8 1 10 0.58600 29 50 74
13 DelimitedSplit8K 1000 16 1 10 0.11000 72 102 137
14 DelimitedSplit8K_NULLS 1000 16 1 10 0.84300 72 102 137
15 DelimitedSplit8K_NULLS 1000 16 1 10 0.69600 72 102 137
16 DelimitedSplit8K 1000 32 1 10 0.24300 148 207 251
17 DelimitedSplit8K_NULLS 1000 32 1 10 1.37600 148 207 251
18 DelimitedSplit8K_NULLS 1000 32 1 10 1.23600 148 207 251
19 DelimitedSplit8K 1000 64 1 10 0.40300 328 415 482
20 DelimitedSplit8K_NULLS 1000 64 1 10 2.48600 328 415 482
21 DelimitedSplit8K_NULLS 1000 64 1 10 2.29000 328 415 482
22 DelimitedSplit8K 1000 128 1 10 0.82600 734 832 942
23 DelimitedSplit8K_NULLS 1000 128 1 10 4.69600 734 832 942
24 DelimitedSplit8K_NULLS 1000 128 1 10 4.46000 734 832 942
25 DelimitedSplit8K 1000 256 1 10 1.62000 1528 1661 1815
26 DelimitedSplit8K_NULLS 1000 256 1 10 8.88000 1528 1661 1815
27 DelimitedSplit8K_NULLS 1000 256 1 10 8.87300 1528 1661 1815
28 DelimitedSplit8K 1000 512 1 10 3.21000 3124 3322 3493
29 DelimitedSplit8K_NULLS 1000 512 1 10 18.02600 3124 3322 3493
30 DelimitedSplit8K_NULLS 1000 512 1 10 17.05000 3124 3322 3493
31 DelimitedSplit8K 1000 1150 1 10 7.44600 7158 7478 7798
32 DelimitedSplit8K_NULLS 1000 1150 1 10 40.46000 7158 7478 7798
33 DelimitedSplit8K_NULLS 1000 1150 1 10 37.25000 7158 7478 7798
34 DelimitedSplit8K 1000 1 10 20 0.01600 10 14 20
35 DelimitedSplit8K_NULLS 1000 1 10 20 0.20600 10 14 20
36 DelimitedSplit8K_NULLS 1000 1 10 20 0.20600 10 14 20
37 DelimitedSplit8K 1000 2 10 20 0.02000 21 30 41
38 DelimitedSplit8K_NULLS 1000 2 10 20 0.30000 21 30 41
39 DelimitedSplit8K_NULLS 1000 2 10 20 0.30600 21 30 41
40 DelimitedSplit8K 1000 4 10 20 0.04600 46 62 81
41 DelimitedSplit8K_NULLS 1000 4 10 20 0.46000 46 62 81
42 DelimitedSplit8K_NULLS 1000 4 10 20 0.46000 46 62 81
43 DelimitedSplit8K 1000 8 10 20 0.09600 102 126 155
44 DelimitedSplit8K_NULLS 1000 8 10 20 0.74000 102 126 155
45 DelimitedSplit8K_NULLS 1000 8 10 20 0.75000 102 126 155
46 DelimitedSplit8K 1000 16 10 20 0.18300 216 254 292
47 DelimitedSplit8K_NULLS 1000 16 10 20 1.32000 216 254 292
48 DelimitedSplit8K_NULLS 1000 16 10 20 1.32300 216 254 292
49 DelimitedSplit8K 1000 32 10 20 0.36000 458 510 586
50 DelimitedSplit8K_NULLS 1000 32 10 20 2.52300 458 510 586
51 DelimitedSplit8K_NULLS 1000 32 10 20 2.49600 458 510 586
52 DelimitedSplit8K 1000 64 10 20 0.74000 945 1022 1097
53 DelimitedSplit8K_NULLS 1000 64 10 20 4.75600 945 1022 1097
54 DelimitedSplit8K_NULLS 1000 64 10 20 4.79000 945 1022 1097
55 DelimitedSplit8K 1000 128 10 20 1.44000 1944 2047 2156
56 DelimitedSplit8K_NULLS 1000 128 10 20 9.38300 1944 2047 2156
57 DelimitedSplit8K_NULLS 1000 128 10 20 9.44600 1944 2047 2156
58 DelimitedSplit8K 1000 256 10 20 2.88300 3903 4096 4254
59 DelimitedSplit8K_NULLS 1000 256 10 20 18.67000 3903 4096 4254
60 DelimitedSplit8K_NULLS 1000 256 10 20 18.86600 3903 4096 4254
61 DelimitedSplit8K 1000 480 10 20 5.45600 7432 7680 7881
62 DelimitedSplit8K_NULLS 1000 480 10 20 34.67600 7432 7680 7881
63 DelimitedSplit8K_NULLS 1000 480 10 20 35.24600 7432 7680 7881
64 DelimitedSplit8K 1000 1 20 30 0.01300 20 24 30
65 DelimitedSplit8K_NULLS 1000 1 20 30 0.24000 20 24 30
66 DelimitedSplit8K_NULLS 1000 1 20 30 0.24300 20 24 30
67 DelimitedSplit8K 1000 2 20 30 0.02300 41 51 61
68 DelimitedSplit8K_NULLS 1000 2 20 30 0.41300 41 51 61
69 DelimitedSplit8K_NULLS 1000 2 20 30 0.40300 41 51 61
70 DelimitedSplit8K 1000 4 20 30 0.07000 86 103 121
71 DelimitedSplit8K_NULLS 1000 4 20 30 0.66000 86 103 121
72 DelimitedSplit8K_NULLS 1000 4 20 30 0.66000 86 103 121
73 DelimitedSplit8K 1000 8 20 30 0.13600 180 206 232
74 DelimitedSplit8K_NULLS 1000 8 20 30 1.15600 180 206 232
75 DelimitedSplit8K_NULLS 1000 8 20 30 1.19300 180 206 232
76 DelimitedSplit8K 1000 16 20 30 0.27000 379 415 456
77 DelimitedSplit8K_NULLS 1000 16 20 30 2.17300 379 415 456
78 DelimitedSplit8K_NULLS 1000 16 20 30 2.21600 379 415 456
79 DelimitedSplit8K 1000 32 20 30 0.52600 774 830 892
80 DelimitedSplit8K_NULLS 1000 32 20 30 4.18600 774 830 892
81 DelimitedSplit8K_NULLS 1000 32 20 30 4.33000 774 830 892
82 DelimitedSplit8K 1000 64 20 30 1.05600 1582 1663 1747
83 DelimitedSplit8K_NULLS 1000 64 20 30 8.30000 1582 1663 1747
84 DelimitedSplit8K_NULLS 1000 64 20 30 8.37300 1582 1663 1747
85 DelimitedSplit8K 1000 128 20 30 2.19300 3208 3325 3442
86 DelimitedSplit8K_NULLS 1000 128 20 30 16.45300 3208 3325 3442
87 DelimitedSplit8K_NULLS 1000 128 20 30 16.68600 3208 3325 3442
88 DelimitedSplit8K 1000 256 20 30 4.22000 6484 6657 6834
89 DelimitedSplit8K_NULLS 1000 256 20 30 32.53000 6484 6657 6834
90 DelimitedSplit8K_NULLS 1000 256 20 30 32.85000 6484 6657 6834
91 DelimitedSplit8K 1000 290 20 30 4.76000 7369 7539 7712
92 DelimitedSplit8K_NULLS 1000 290 20 30 37.29000 7369 7539 7712
93 DelimitedSplit8K_NULLS 1000 290 20 30 38.92000 7369 7539 7712
94 DelimitedSplit8K 1000 1 30 40 0.01300 30 34 40
95 DelimitedSplit8K_NULLS 1000 1 30 40 0.27300 30 34 40
96 DelimitedSplit8K_NULLS 1000 1 30 40 0.28000 30 34 40
97 DelimitedSplit8K 1000 2 30 40 0.02600 61 70 81
98 DelimitedSplit8K_NULLS 1000 2 30 40 0.50300 61 70 81
99 DelimitedSplit8K_NULLS 1000 2 30 40 0.51000 61 70 81
100 DelimitedSplit8K 1000 4 30 40 0.09300 124 142 158
101 DelimitedSplit8K_NULLS 1000 4 30 40 0.89300 124 142 158
102 DelimitedSplit8K_NULLS 1000 4 30 40 0.93600 124 142 158
103 DelimitedSplit8K 1000 8 30 40 0.18600 262 286 314
104 DelimitedSplit8K_NULLS 1000 8 30 40 1.66300 262 286 314
105 DelimitedSplit8K_NULLS 1000 8 30 40 1.67600 262 286 314
106 DelimitedSplit8K 1000 16 30 40 0.35000 529 574 613
107 DelimitedSplit8K_NULLS 1000 16 30 40 3.19300 529 574 613
108 DelimitedSplit8K_NULLS 1000 16 30 40 3.25000 529 574 613
109 DelimitedSplit8K 1000 32 30 40 0.71300 1097 1150 1222
110 DelimitedSplit8K_NULLS 1000 32 30 40 7.12600 1097 1150 1222
111 DelimitedSplit8K_NULLS 1000 32 30 40 6.80300 1097 1150 1222
112 DelimitedSplit8K 1000 64 30 40 1.51600 2225 2302 2376
113 DelimitedSplit8K_NULLS 1000 64 30 40 12.73300 2225 2302 2376
114 DelimitedSplit8K_NULLS 1000 64 30 40 12.53000 2225 2302 2376
115 DelimitedSplit8K 1000 128 30 40 2.85600 4487 4604 4706
116 DelimitedSplit8K_NULLS 1000 128 30 40 24.68000 4487 4604 4706
117 DelimitedSplit8K_NULLS 1000 128 30 40 24.94300 4487 4604 4706
118 DelimitedSplit8K 1000 210 30 40 4.99600 7378 7558 7705
119 DelimitedSplit8K_NULLS 1000 210 30 40 41.28300 7378 7558 7705
120 DelimitedSplit8K_NULLS 1000 210 30 40 41.99300 7378 7558 7705
121 DelimitedSplit8K 1000 1 40 50 0.01300 40 44 50
122 DelimitedSplit8K_NULLS 1000 1 40 50 0.33300 40 44 50
123 DelimitedSplit8K_NULLS 1000 1 40 50 0.34000 40 44 50
124 DelimitedSplit8K 1000 2 40 50 0.03000 81 91 101
125 DelimitedSplit8K_NULLS 1000 2 40 50 0.65600 81 91 101
126 DelimitedSplit8K_NULLS 1000 2 40 50 0.68000 81 91 101
127 DelimitedSplit8K 1000 4 40 50 0.11600 166 183 199
128 DelimitedSplit8K_NULLS 1000 4 40 50 1.19600 166 183 199
129 DelimitedSplit8K_NULLS 1000 4 40 50 1.22600 166 183 199
130 DelimitedSplit8K 1000 8 40 50 0.22300 341 367 392
131 DelimitedSplit8K_NULLS 1000 8 40 50 2.44000 341 367 392
132 DelimitedSplit8K_NULLS 1000 8 40 50 2.31300 341 367 392
133 DelimitedSplit8K 1000 16 40 50 0.47600 700 735 777
134 DelimitedSplit8K_NULLS 1000 16 40 50 4.64300 700 735 777
135 DelimitedSplit8K_NULLS 1000 16 40 50 4.69600 700 735 777
136 DelimitedSplit8K 1000 32 40 50 0.91000 1401 1471 1531
137 DelimitedSplit8K_NULLS 1000 32 40 50 9.08000 1401 1471 1531
138 DelimitedSplit8K_NULLS 1000 32 40 50 9.06600 1401 1471 1531
139 DelimitedSplit8K 1000 64 40 50 1.77600 2860 2945 3026
140 DelimitedSplit8K_NULLS 1000 64 40 50 17.82300 2860 2945 3026
141 DelimitedSplit8K_NULLS 1000 64 40 50 17.84600 2860 2945 3026
142 DelimitedSplit8K 1000 128 40 50 3.54300 5785 5887 5997
143 DelimitedSplit8K_NULLS 1000 128 40 50 35.12000 5785 5887 5997
144 DelimitedSplit8K_NULLS 1000 128 40 50 36.19600 5785 5887 5997
145 DelimitedSplit8K 1000 165 40 50 4.55600 7473 7589 7743
146 DelimitedSplit8K_NULLS 1000 165 40 50 45.69600 7473 7589 7743
147 DelimitedSplit8K_NULLS 1000 165 40 50 45.45300 7473 7589 7743
Here's the revised function I tested. I won't claim that it is the best variation--just a quick-and-dirty trial run for comparison purposes.
/*
--Some test strings for manual testing
SELECT * FROM dbo.DelimitedSplit8K_NULLS(' ',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS(NULL,' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS(' ',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS(NULL,',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',1)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',1)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123 467 789',' ',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123, 467, 789',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123, 467, 789',',',1)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123,467,789',',',0)
SELECT * FROM dbo.DelimitedSplit8K_NULLS('123,467,789',',',1)
*/
CREATE FUNCTION [dbo].[DelimitedSplit8K_NULLS]
(
@pString VARCHAR(8000)
,@pDelimiter CHAR(1)
,@bRemoveSpaceAfterDelimiter BIT
)
RETURNS @Split TABLE
(
[ItemNumber] INT
,[Item] VARCHAR(50)
)
WITH SCHEMABINDING
AS
BEGIN
SET @pString = NULLIF(RTRIM(LTRIM(@pString)),'')
IF @bRemoveSpaceAfterDelimiter = 1
SET @pString = REPLACE(@pString,@pDelimiter+' ',@pDelimiter)
IF @pString IS NULL
RETURN
ELSE
WITH E1(N)
AS (
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
UNION ALL
SELECT
1
), --10E+1 or 10 rows
E2(N)
AS (
SELECT
1
FROM
E1 a
,E1 b
), --10E+2 or 100 rows
E4(N)
AS (
SELECT
1
FROM
E2 a
,E2 b
), --10E+4 or 10,000 rows max
cteTally(N)
AS (
SELECT TOP (ISNULL(DATALENGTH(@pString),0))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM E4
),
cteStart(N1)
AS (
SELECT
1
UNION ALL
SELECT
t.N + 1
FROM
cteTally t
WHERE
SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1)
AS (
SELECT
s.N1
,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0) - s.N1,8000)
FROM
cteStart s
)
INSERT INTO @Split
SELECT
ItemNumber = ROW_NUMBER() OVER (ORDER BY l.N1)
,Item = SUBSTRING(@pString,l.N1,l.L1)
FROM
cteLen l
WHERE
NULLIF(RTRIM(LTRIM(SUBSTRING(@pString,l.N1,l.L1))),'') IS NOT NULL
RETURN
END
January 16, 2013 at 1:40 pm
Due to the additional checking for nulls/blanks and removing blank rows from the returned table I figured efficiency would be seriously compromised. So I tested this revision against the original and it seriously under performs. In the results below the first run of DelimitedSplit8k_NULLS sets the parameter "@bRemoveSpacesAfterDelimiter" to 0 and the 2nd run set it to 1.
Steve,
I came up with the same results after doing almost exactly the same thing.
In fact, I found that even after taking out the cleaning of the duplicate characters and check for null, the function with the return table declaration is a lot slower than the original single statement function. I wonder if SQL Server is using a different memory management scheme.
I guess having the WHERE clause in the CROSS APPLY do the work gives the next person looking at my code an idea of what I am trying to do.
Either way it has been a fun afternoon of testing the assumptions and changes. This is a perfect example of why I still love programming after over 20 years of doing this stuff. Going through Jeff's code line by line was quite a revelation in how to use CTE, NULLIF and "Tally" tables in ways I never thought of.
Anton
January 16, 2013 at 9:38 pm
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune. You basically get a cursor under the covers because the mTVF cannot be truly in-lined with the query/queryplan. The optimization process is shot to hell too (although that won't come into play here). Voids parallelism also IIRC.
Try doing a statement-level completion profiler trace while running each type of code. CAUTION: beware statement completed profiling on production systems!
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 16, 2013 at 11:02 pm
TheSQLGuru (1/16/2013)
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune. You basically get a cursor under the covers
What was startling for me was not that functions acted like an implicit cursor (I'd always known that) but rather that you could create inline functions that don't!
January 17, 2013 at 3:01 am
aochss (1/16/2013)
The performance of this is amazing, as is the thought behind the code. These kind of posts really make you think and learn a whole new set of features in SQLThanks Jeff.
A couple of questions (sorry if it has been addressed - I didn't see it mentioned in the thread):
I would not expect a record back if the string being sent in is NULL, a single space or all spaces. In my case, I have records where the string being split is often blank and I would want no records coming back from the CROSS APPLY. I could filter them out using in the WHERE clause.
The best indeed would be to filter them out in the WHERE clause with something like WHERE LEN(StringColumn) > 0. Hence, the function calls would be for only those rows which are neither NULL nor containing only space/s.
But this can also be done in the function itself as I have tried to do it.
Also, some of my "not so good" data also had multiple spaces in between the values causing numerous records returned that were blank.For example:
dbo.DelimitedSplit8k('123 467 789', ' ') -- There are two spaces between 467 and 789
ItemNumber Item
1 123
2 467
3 <-- Empty String
4 789
Thanks Again,
Anton
Here is my attempt while keeping the function iTVF (I did not test it thoroughly). Only slight changes are done
CREATE FUNCTION [dbo].[DelimitedSplit8K_With_NULL_ZeroLengthString_And_ConsecutiveDelimiter_Filter]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter VARCHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
/*========= USE LEN() FUNCTION TO CHECK WHETHER STRING IS NULL OR CONTAIN ONLY SPACES =======*/
WHERE LEN(@pString) > 0 --FILTER OUT NON_REQUIRED
/*===========================================================================================*/
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1
/*========= USE LEN() FUNCTION TO CHECK WHETHER STRING IS NULL OR CONTAIN ONLY SPACES =======*/
WHERE LEN(@pString) > 0 --FILTER OUT NON_REQUIRED
/*===========================================================================================*/
UNION ALL
SELECT t.N+1 FROM cteTally t
WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
/*====== --FILTER OUT TWO CONSECUTIVE OCCURENCES OF DELIMITERS ============*/
AND SUBSTRING(@pString,t.N+1,1) <> @pDelimiter
/*=========================================================================*/
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = RTRIM(LTRIM(SUBSTRING(@pString, l.N1, l.L1)))--USE LTRIM AND RTRIM FUNCTION TO OMIT TRAILING SPACES
FROM cteLen l
;
January 17, 2013 at 3:14 am
TheSQLGuru (1/16/2013)
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune.
Not always true 🙂 It really depends upon several things. For instance, if we have to deal with VARCHAR(max) strings and multiple rows, the mTVF version of the tally table splitter can beat the iTVF version quite comprehensively. Sometimes, mTVF versions can show their worth too 😉
I am not trying to be infamous but sometimes RBAR approach could be handy as well :-D. I hope my Anti-RBAR membership is not cancelled for issuing this statement :hehe:
January 17, 2013 at 3:35 am
Usman Butt (1/17/2013)
TheSQLGuru (1/16/2013)
Once you switch from an iTVF to a multi-statement TVF you are dead meat and the Fat Lady is belting a tune.Not always true 🙂 It really depends upon several things. For instance, if we have to deal with VARCHAR(max) strings and multiple rows, the mTVF version of the tally table splitter can beat the iTVF version quite comprehensively. Sometimes, mTVF versions can show their worth too 😉
I am not trying to be infamous but sometimes RBAR approach could be handy as well :-D. I hope my Anti-RBAR membership is not cancelled for issuing this statement :hehe:
Multi-line TVF work well when they are called/invoked sparsly or only once as say the logical starting table in your from clause, before other joins take place. They can also work quite well as part of an additional condition in a join. If it is not the only condition and applied using OR logic, it only needs to be invoked when other simpler conditions do not match.
mTVF also allow some control over the indexing of the result table (trough PK and unique constraints). This can speed up certain operations in complex processing as the required indexing on the datamodel might not be there and/or too costly to implement for all records.
What I generaly do is write first as an iTVF and if for some reason it turns out unexpectedly slow or impractical, make a mTVF from it as well with proper indexing to speed up the propblem operation.
January 17, 2013 at 9:49 am
I have no problem whatsoever in using the right tool for the job!! In fact, I make a fair bit of my consulting dollars cleaning up messes from clients who do NOT do that! 😎 Sometimes being "dead meat" IS the better alternative. :w00t:
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 11, 2013 at 12:55 pm
Hello,
Greatt article. I used the function and works great. I have a situation where I need to search for more than 1 character, then split the data. How could I use this function to search for 2 characters. For example, I want to search for semicolon, or, comma (;,), then, split the column if either is found.
February 11, 2013 at 12:59 pm
How can I use this function to search for more than 1 character? I have data that is delimited by either a comma or semicolon (,;). Need to split contents based on either character.
February 11, 2013 at 1:03 pm
ahpitre (2/11/2013)
How can I use this function to search for more than 1 character? I have data that is delimited by either a comma or semicolon (,;). Need to split contents based on either character.
Easiest way I can think of would be to simply use replace.
select *
from YourTable
cross apply dbo.DelimitedSplit8k(replace(YourColumn, ';', ','), ',')
This way you are still splitting on commas but it will work for either character. 😉
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 11, 2013 at 3:26 pm
ahpitre (2/11/2013)
How can I use this function to search for more than 1 character? I have data that is delimited by either a comma or semicolon (,;). Need to split contents based on either character.
Just throw in a replace to change the second character into the desired delimiter. I can't guess what effect this would have on performance of the function.
If you are talking about a two-dimensional array like 'A,B;X,Y;D,E' and you want to split that into two columns, then that's a different problem.
DECLARE @strExample VARCHAR(8000)
SET @strExample = 'A,B;C;D,E,F,G,H'
SELECT
ItemNumber
,Item
FROM
dbo.DelimitedSplit8K(REPLACE(@strExample,';',','),',') AS dsk
February 11, 2013 at 3:28 pm
Sorry Sean...I didn't see that you had already answered with exactly the same idea. 😛
February 11, 2013 at 3:39 pm
Steven Willis (2/11/2013)
Sorry Sean...I didn't see that you had already answered with exactly the same idea. 😛
No worries. Great minds...
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 11, 2013 at 5:13 pm
If you have two delimiters you can change the code slightly:
DECLARE @pString varchar(8000) = '55555;4444,333,22,1';
DECLARE @pDelimiter1 char(1) = ',';
DECLARE @pDelimiter2 char(2) = ';';
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) in (@pDelimiter1, @pDelimiter2)
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
Case when ISNULL(NULLIF(CHARINDEX(@pDelimiter1,@pString,s.N1),0)-s.N1,8000)
< ISNULL(NULLIF(CHARINDEX(@pDelimiter2,@pString,s.N1),0)-s.N1,8000)
then ISNULL(NULLIF(CHARINDEX(@pDelimiter1,@pString,s.N1),0)-s.N1,8000)
else ISNULL(NULLIF(CHARINDEX(@pDelimiter2,@pString,s.N1),0)-s.N1,8000)
end
FROM cteStart s
)
select * from cteLen;
Performance of the second charindex might be similar to using replace on the input string. An alternative would be do join cteStart to itself to derive cteLen. For a single delimiter it is slower than charindex but for multiples it might be better. Certainly for more than 2 delimiters the self join would be simpler code to read.
Viewing 15 posts - 481 through 495 (of 981 total)
You must be logged in to reply to this topic. Login to reply