Shred Me This

Question

Post reply

Shred Me This

Dwain Camps

SSC Guru

Points: 86913
More actions
May 23, 2012 at 7:37 pm

#261480

500+ posts and I finally get to start my second thread! Hopefully I'll get a better response on this one that I did on my first.

Fumbling around as I am wont to do with XML the other day, I ran across some XML that I simply couldn't shred the way I wanted to.

Here's the setup data:

DECLARE @t TABLE (ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50) ,val2 VARCHAR(50), val3 VARCHAR(50)) INSERT INTO @t SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A' UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B' UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C' UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D' DECLARE @xml1 XML, @xml2 XML SELECT @xml1 = (SELECT * FROM @t FOR XML PATH('')) SELECT @xml1

This produces XML that looks as follows:

<ID>1</ID> <key1>REC01 </key1> <val1>VALUE 01A</val1> <val2>VALUE 02A</val2> <val3>VALUE 03A</val3> <ID>2</ID> <key1>REC02 </key1> <val1>VALUE 01B</val1> <val2>VALUE 02B</val2> <val3>VALUE 03B</val3> <ID>3</ID> <key1>REC03 </key1> <val1>VALUE 01C</val1> <val2>VALUE 02C</val2> <val3>VALUE 03C</val3> <ID>4</ID> <key1>REC04 </key1> <val1>VALUE 01D</val1> <val2>VALUE 02D</val2> <val3>VALUE 03D</val3>

I am able to extract any of the records individually as follows:

-- Read first record SELECT row.value('ID[1]', 'INT') as ID ,row.value('key1[1]', 'CHAR(10)') as key1 ,row.value('val1[1]', 'VARCHAR(50)') as var1 ,row.value('val2[1]', 'VARCHAR(50)') as var2 ,row.value('val3[1]', 'VARCHAR(50)') as var3 FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('.') t(row) -- Read second record SELECT row.value('ID[2]', 'INT') as ID ,row.value('key1[2]', 'CHAR(10)') as key1 ,row.value('val1[2]', 'VARCHAR(50)') as var1 ,row.value('val2[2]', 'VARCHAR(50)') as var2 ,row.value('val3[2]', 'VARCHAR(50)') as var3 FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('.') t(row)

I am not able to extract all 4 records into my result set. Of course, I can if I format the XML differently, for example like this:

SELECT @xml2 = (SELECT * FROM @t FOR XML PATH('IDS')) --SELECT @xml2 = CAST('<IDS>' + CAST(@xml1 AS VARCHAR(MAX)) + '</IDS>' AS XML) SELECT @xml2 SELECT row.value('ID[1]', 'INT') as ID ,row.value('key1[1]', 'CHAR(10)') as key1 ,row.value('val1[1]', 'VARCHAR(50)') as var1 ,row.value('val2[1]', 'VARCHAR(50)') as var2 ,row.value('val3[1]', 'VARCHAR(50)') as var3 FROM (SELECT @xml2 AS XML2) x CROSS APPLY XML2.nodes('/IDS') t(row)

Any XML experts out there with a clue to this one?

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Mark Cowne One Orange Chip Points: 26865 More actions · Answer 1

Try this, I can't see a simpler way with the XML as is.



WITH CTE AS (

SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

      ,row.value('text()[1]', 'VARCHAR(50)') AS val

      ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

      ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

FROM (SELECT @xml1 AS XML1) x

CROSS APPLY XML1.nodes('/*') t(row))

SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID,

       MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1,

       MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1,

       MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2,

       MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3

FROM CTE

GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

Edit : Fixed to handle case when group does not have all elements present.

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Paul White SSC Guru Points: 150457 More actions · Answer 2

Are you able to alter the query to produce well-formed XML?



DECLARE @t TABLE

(

    ID INT IDENTITY,

    key1 CHAR(10),

    val1 VARCHAR(50),

	val2 VARCHAR(50),

    val3 VARCHAR(50)

);

INSERT INTO @t

SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A'

UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B'

UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C'

UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D'

DECLARE @xml1 xml = (SELECT * FROM @t FOR XML PATH('record'), ROOT('root'))

SELECT @xml1

SELECT

    ID = t1.n.value('./ID[1]', 'integer'),

    key1 = t1.n.value('./key1[1]', 'char(10)'),

    val1 = t1.n.value('./val1[1]', 'varchar(50)'),

    val2 = t1.n.value('./val2[1]', 'varchar(50)'),

    val3 = t1.n.value('./val3[1]', 'varchar(50)')

FROM @xml1.nodes('./root/record') AS t1 (n);

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Dwain Camps SSC Guru Points: 86913 More actions · Answer 3

Thanks Mark and Paul for the suggestions.

The case I came up with is for a training session so I'm trying to illustrate a point. I understand how to shred if it has a named root node (I have other examples that cover this).

I am happy to use this to illustrate how not to form XML. I was just afraid that at my experience level, I was missing something.

Mark - I'll need to analyze your suggestion closely to figure out what you're doing. MS might as well have written the documentation in Klingon for all the good it does me.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mark Cowne One Orange Chip Points: 26865 More actions · Answer 4

Mark Cowne

One Orange Chip

Points: 26865

May 24, 2012 at 6:01 am

#1492452

-- post deleted...

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

R.P.Rozema SSChampion Points: 12305 More actions · Answer 5

Of course the xml should be improved on for any practical implementation, but still this challenged me. Since this is hypothetical already and on top of that Mark is in this discussion too, I thought I'd take this a little further. Mark recently introduced me to << in an xquery (or is it xpath?) expression. I have so far not been able to find what exactly it does :crazy:. I started experimenting with it for this challenge... I have now found how to identify the next ID element following each ID element in the document, but I don't know how to continue...

Here's me hoping Mark can shed some more light on what exactly this <<-operator (and apparently the opposite >> works too?) does and maybe from my experimenting others get an idea how we can get a working solution for this challenge.

What I've come up with finds for each ID-element the next ID element in the xml, i.e. all values for this ID must be in the elements in between these two elements. I don't know how to continue from here, but maybe someone else can help?

DECLARE @t TABLE



	(ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50)

	,val2 VARCHAR(50), val3 VARCHAR(50))

INSERT INTO @t

SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A'

UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B'

UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C'

UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D'

DECLARE @xml1 XML, @xml2 XML

SELECT @xml1 = (SELECT * FROM @t FOR XML PATH(''))

SELECT @xml1

select n.n.value('.','int') as ID,

   n.n.query('for $b in . return $b/../*[. >> $b][local-name(.) = "ID"][1]')

from @xml1.nodes('/ID') n(n)

order by 1

The results so far:

ID (No column name)



1	<ID>2</ID>

2	<ID>3</ID>

3	<ID>4</ID>

4

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

Mark Cowne One Orange Chip Points: 26865 More actions · Answer 6

R.P.Rozema (5/24/2012)

Of course the xml should be improved on for any practical implementation, but still this challenged me. Since this is hypothetical already and on top of that Mark is in this discussion too, I thought I'd take this a little further. Mark recently introduced me to << in an xquery (or is it xpath?) expression. I have so far not been able to find what exactly it does :crazy:. I started experimenting with it for this challenge... I have now found how to identify the next ID element following each ID element in the document, but I don't know how to continue...

Here's me hoping Mark can shed some more light on what exactly this <<-operator (and apparently the opposite >> works too?) does and maybe from my experimenting others get an idea how we can get a working solution for this challenge.

What I've come up with finds for each ID-element the next ID element in the xml, i.e. all values for this ID must be in the elements in between these two elements. I don't know how to continue from here, but maybe someone else can help?

DECLARE @t TABLE
(ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50) ,val2 VARCHAR(50), val3 VARCHAR(50)) INSERT INTO @t SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A' UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B' UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C' UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D' DECLARE @xml1 XML, @xml2 XML SELECT @xml1 = (SELECT * FROM @t FOR XML PATH('')) SELECT @xml1 select n.n.value('.','int') as ID, n.n.query('for $b in . return $b/../*[. >> $b][local-name(.) = "ID"][1]') from @xml1.nodes('/ID') n(n) order by 1

The results so far:

ID (No column name)
1 <ID>2</ID> 2 <ID>3</ID> 3 <ID>4</ID>
4

Have a look at "Node Order Comparison Operators" here, it's a bit sketchy though.

http://msdn.microsoft.com/en-us/library/ms190935.aspx

I've updated the query I originally posted to handle the case of missing elements - it uses a difference of

ROW_NUMBERs rather than the ">>" operator which is, as you pointed out, much faster.

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Dwain Camps SSC Guru Points: 86913 More actions · Answer 7

Mark-101232 (5/24/2012)

Try this, I can't see a simpler way with the XML as is.

WITH CTE AS ( SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name ,row.value('text()[1]', 'VARCHAR(50)') AS val ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1 ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2 FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('/*') t(row)) SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID, MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1, MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1, MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2, MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3 FROM CTE GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

Edit : Fixed to handle case when group does not have all elements present.

This is a very interesting solution and I'm only beginning to understand how it works. In my effort to do so, I found another alternative based on it that also works.



;WITH CTE AS (

SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

      ,row.value('text()[1]', 'VARCHAR(50)') AS val

      ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

      ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

FROM (SELECT @xml1 AS XML1) x

CROSS APPLY XML1.nodes('/*') t(row))

SELECT ID=MAX(CAST(ID AS INT)), key1=MAX(CAST(key1 AS CHAR(10)))

	, val1=MAX(CAST(val1 AS VARCHAR(50))), val2=MAX(CAST(val2 AS VARCHAR(50)))

	, val3=MAX(CAST(val3 AS VARCHAR(50)))

FROM CTE

PIVOT (MAX(val) FOR name IN ([ID], [key1], [val1], [val2], [val3])) pt

GROUP BY CASE WHEN ID IS NOT NULL THEN rn2 ELSE rn1-rn2 END

I will continue to research until the light bulb comes on.:hehe:

Thanks again!

Edit: Corrected to include CASTing of results.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86913 More actions · Answer 8

Mark-101232 (5/24/2012)

R.P.Rozema (5/24/2012)

Of course the xml should be improved on for any practical implementation, but still this challenged me. Since this is hypothetical already and on top of that Mark is in this discussion too, I thought I'd take this a little further. Mark recently introduced me to << in an xquery (or is it xpath?) expression. I have so far not been able to find what exactly it does :crazy:. I started experimenting with it for this challenge... I have now found how to identify the next ID element following each ID element in the document, but I don't know how to continue...

Here's me hoping Mark can shed some more light on what exactly this <<-operator (and apparently the opposite >> works too?) does and maybe from my experimenting others get an idea how we can get a working solution for this challenge.

What I've come up with finds for each ID-element the next ID element in the xml, i.e. all values for this ID must be in the elements in between these two elements. I don't know how to continue from here, but maybe someone else can help?

DECLARE @t TABLE
(ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50) ,val2 VARCHAR(50), val3 VARCHAR(50)) INSERT INTO @t SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A' UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B' UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C' UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D' DECLARE @xml1 XML, @xml2 XML SELECT @xml1 = (SELECT * FROM @t FOR XML PATH('')) SELECT @xml1 select n.n.value('.','int') as ID, n.n.query('for $b in . return $b/../*[. >> $b][local-name(.) = "ID"][1]') from @xml1.nodes('/ID') n(n) order by 1

The results so far:

ID (No column name)
1 <ID>2</ID> 2 <ID>3</ID> 3 <ID>4</ID>
4

Have a look at "Node Order Comparison Operators" here, it's a bit sketchy though.

http://msdn.microsoft.com/en-us/library/ms190935.aspx

I've updated the query I originally posted to handle the case of missing elements - it uses a difference of

ROW_NUMBERs rather than the ">>" operator which is, as you pointed out, much faster.

This suggestion gives me an interesting idea which I am clueless how to act on (haven't read the link yet but I will).

Would it be possible to store each element group in a row (with ID in a column as in the above result set), with a second XML column that contains all of the XML for that element group?

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86913 More actions · Answer 9

dwain.c (5/24/2012)

This suggestion gives me an interesting idea which I am clueless how to act on (haven't read the link yet but I will).

Would it be possible to store each element group in a row (with ID in a column as in the above result set), with a second XML column that contains all of the XML for that element group?

Yes indeed, this can be done.



SELECT row.value('ID[1]', 'INT') as ID

       ,row.value('key1[1]', 'CHAR(10)') as key1

       ,row.value('val1[1]', 'VARCHAR(50)') as var1

       ,row.value('val2[1]', 'VARCHAR(50)') as var2

       ,row.value('val3[1]', 'VARCHAR(50)') as var3

FROM (

	SELECT CAST('<ID>' + strcol AS XML) AS xml1

	FROM (SELECT CAST(@xml1 AS VARCHAR(MAX))) AS x(MyXML)

	CROSS APPLY dbo.SplitString('<ID>', SUBSTRING(MyXML, 5, LEN(MyXML)))) x

CROSS APPLY XML1.nodes('.') t(row)

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

R.P.Rozema SSChampion Points: 12305 More actions · Answer 10

dwain.c (5/24/2012)

Mark-101232 (5/24/2012)

Try this, I can't see a simpler way with the XML as is.

WITH CTE AS ( SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name ,row.value('text()[1]', 'VARCHAR(50)') AS val ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1 ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2 FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('/*') t(row)) SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID, MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1, MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1, MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2, MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3 FROM CTE GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

Edit : Fixed to handle case when group does not have all elements present.

This is a very interesting solution and I'm only beginning to understand how it works. In my effort to do so, I found another alternative based on it that also works.

;WITH CTE AS ( SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name ,row.value('text()[1]', 'VARCHAR(50)') AS val ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1 ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2 FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('/*') t(row)) SELECT ID=MAX(CAST(ID AS INT)), key1=MAX(CAST(key1 AS CHAR(10))) , val1=MAX(CAST(val1 AS VARCHAR(50))), val2=MAX(CAST(val2 AS VARCHAR(50))) , val3=MAX(CAST(val3 AS VARCHAR(50))) FROM CTE PIVOT (MAX(val) FOR name IN ([ID], [key1], [val1], [val2], [val3])) pt GROUP BY CASE WHEN ID IS NOT NULL THEN rn2 ELSE rn1-rn2 END

I will continue to research until the light bulb comes on.:hehe:

Thanks again!

Edit: Corrected to include CASTing of results.

The difference is that Mark used a cross-tab, where you turned to a pivot. Cross tabs are most of the times faster than pivots because it lacks one operation applied to each value by pivot. Cross tab syntax is also considered better readable by most people, but the most important advantage is that cross tabs have a much broader field of problems they can be applied to than pivot. Jeff wrote a very good article comparing the two: http://qa.sqlservercentral.com/articles/T-SQL/63681/.

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

Dwain Camps SSC Guru Points: 86913 More actions · Answer 11

Yet another method that works but is ugly and probably not recommended. I like my previous one better.



SELECT @xml1 = (SELECT * FROM @t FOR XML PATH(''))

-- Add in row tags

SELECT row.value('ID[1]', 'INT') as ID

       ,row.value('key1[1]', 'CHAR(10)') as key1

       ,row.value('val1[1]', 'VARCHAR(50)') as var1

       ,row.value('val2[1]', 'VARCHAR(50)') as var2

       ,row.value('val3[1]', 'VARCHAR(50)') as var3

FROM (

	SELECT CAST(REPLACE(Tag1, '</val3>', '</val3></row>') AS XML) AS XML1

	FROM (

		SELECT Tag1 FROM (SELECT CAST(@xml1 AS VARCHAR(MAX))) z(xml1)

		CROSS APPLY (SELECT REPLACE(xml1, '<ID>', '<row><ID>')) y(Tag1)

	) x ) z

CROSS APPLY XML1.nodes('row') t(row)

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86913 More actions · Answer 12

R.P.Rozema (5/25/2012)

The difference is that Mark used a cross-tab, where you turned to a pivot. Cross tabs are most of the times faster than pivots because it lacks one operation applied to each value by pivot. Cross tab syntax is also considered better readable by most people, but the most important advantage is that cross tabs have a much broader field of problems they can be applied to than pivot. Jeff wrote a very good article comparing the two: http://qa.sqlservercentral.com/articles/T-SQL/63681/.

It's a matter of taste I suppose. Not sure either one is very readable and I've heard the same about performance of PIVOTs (hence I rarely use them). I've seen the link you provided but thanks nonetheless.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mark Cowne One Orange Chip Points: 26865 More actions · Answer 13

dwain.c (5/24/2012)

Mark-101232 (5/24/2012)

Try this, I can't see a simpler way with the XML as is.

WITH CTE AS ( SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name ,row.value('text()[1]', 'VARCHAR(50)') AS val ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1 ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2 FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('/*') t(row)) SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID, MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1, MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1, MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2, MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3 FROM CTE GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

Edit : Fixed to handle case when group does not have all elements present.

This is a very interesting solution and I'm only beginning to understand how it works. In my effort to do so, I found another alternative based on it that also works.

;WITH CTE AS ( SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name ,row.value('text()[1]', 'VARCHAR(50)') AS val ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1 ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2 FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('/*') t(row)) SELECT ID=MAX(CAST(ID AS INT)), key1=MAX(CAST(key1 AS CHAR(10))) , val1=MAX(CAST(val1 AS VARCHAR(50))), val2=MAX(CAST(val2 AS VARCHAR(50))) , val3=MAX(CAST(val3 AS VARCHAR(50))) FROM CTE PIVOT (MAX(val) FOR name IN ([ID], [key1], [val1], [val2], [val3])) pt GROUP BY CASE WHEN ID IS NOT NULL THEN rn2 ELSE rn1-rn2 END

I will continue to research until the light bulb comes on.:hehe:

Thanks again!

Edit: Corrected to include CASTing of results.

It's a variation on the "Gaps and Islands" technique, see Jeff Moden's excellent article here

http://qa.sqlservercentral.com/articles/T-SQL/71550/

Have a looks at the results of the query below, it should make things clearer.

WITH CTE AS (



SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

      ,row.value('text()[1]', 'VARCHAR(50)') AS val

      ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

      ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

FROM (SELECT @xml1 AS XML1) x

CROSS APPLY XML1.nodes('/*') t(row))

SELECT name,

       val,

       rn1,

       rn2,

       CASE WHEN name='ID' THEN rn2 END AS IDrn,

       CASE WHEN name<>'ID' THEN rn1-rn2 END AS NotIDrn,

       CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END AS GroupByThis

FROM CTE

ORDER BY rn1;

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Dwain Camps SSC Guru Points: 86913 More actions · Answer 14

You have no idea what a time-saver you just posted for me. I needed to search out that very same gaps and islands article for something else I'm working on. Now I have it.

It also does help to explain the solution you provided. Thanks much!

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St