Shred Me This

  • 500+ posts and I finally get to start my second thread! Hopefully I'll get a better response on this one that I did on my first.

    Fumbling around as I am wont to do with XML the other day, I ran across some XML that I simply couldn't shred the way I wanted to.

    Here's the setup data:

    DECLARE @t TABLE

    (ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50)

    ,val2 VARCHAR(50), val3 VARCHAR(50))

    INSERT INTO @t

    SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A'

    UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B'

    UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C'

    UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D'

    DECLARE @xml1 XML, @xml2 XML

    SELECT @xml1 = (SELECT * FROM @t FOR XML PATH(''))

    SELECT @xml1

    This produces XML that looks as follows:

    <ID>1</ID>

    <key1>REC01 </key1>

    <val1>VALUE 01A</val1>

    <val2>VALUE 02A</val2>

    <val3>VALUE 03A</val3>

    <ID>2</ID>

    <key1>REC02 </key1>

    <val1>VALUE 01B</val1>

    <val2>VALUE 02B</val2>

    <val3>VALUE 03B</val3>

    <ID>3</ID>

    <key1>REC03 </key1>

    <val1>VALUE 01C</val1>

    <val2>VALUE 02C</val2>

    <val3>VALUE 03C</val3>

    <ID>4</ID>

    <key1>REC04 </key1>

    <val1>VALUE 01D</val1>

    <val2>VALUE 02D</val2>

    <val3>VALUE 03D</val3>

    I am able to extract any of the records individually as follows:

    -- Read first record

    SELECT row.value('ID[1]', 'INT') as ID

    ,row.value('key1[1]', 'CHAR(10)') as key1

    ,row.value('val1[1]', 'VARCHAR(50)') as var1

    ,row.value('val2[1]', 'VARCHAR(50)') as var2

    ,row.value('val3[1]', 'VARCHAR(50)') as var3

    FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('.') t(row)

    -- Read second record

    SELECT row.value('ID[2]', 'INT') as ID

    ,row.value('key1[2]', 'CHAR(10)') as key1

    ,row.value('val1[2]', 'VARCHAR(50)') as var1

    ,row.value('val2[2]', 'VARCHAR(50)') as var2

    ,row.value('val3[2]', 'VARCHAR(50)') as var3

    FROM (SELECT @xml1 AS XML1) x CROSS APPLY XML1.nodes('.') t(row)

    I am not able to extract all 4 records into my result set. Of course, I can if I format the XML differently, for example like this:

    SELECT @xml2 = (SELECT * FROM @t FOR XML PATH('IDS'))

    --SELECT @xml2 = CAST('<IDS>' + CAST(@xml1 AS VARCHAR(MAX)) + '</IDS>' AS XML)

    SELECT @xml2

    SELECT row.value('ID[1]', 'INT') as ID

    ,row.value('key1[1]', 'CHAR(10)') as key1

    ,row.value('val1[1]', 'VARCHAR(50)') as var1

    ,row.value('val2[1]', 'VARCHAR(50)') as var2

    ,row.value('val3[1]', 'VARCHAR(50)') as var3

    FROM (SELECT @xml2 AS XML2) x CROSS APPLY XML2.nodes('/IDS') t(row)

    Any XML experts out there with a clue to this one?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Try this, I can't see a simpler way with the XML as is.

    WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID,

    MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1,

    MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1,

    MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2,

    MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3

    FROM CTE

    GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

    Edit : Fixed to handle case when group does not have all elements present.

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Are you able to alter the query to produce well-formed XML?

    DECLARE @t TABLE

    (

    ID INT IDENTITY,

    key1 CHAR(10),

    val1 VARCHAR(50),

    val2 VARCHAR(50),

    val3 VARCHAR(50)

    );

    INSERT INTO @t

    SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A'

    UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B'

    UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C'

    UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D'

    DECLARE @xml1 xml = (SELECT * FROM @t FOR XML PATH('record'), ROOT('root'))

    SELECT @xml1

    SELECT

    ID = t1.n.value('./ID[1]', 'integer'),

    key1 = t1.n.value('./key1[1]', 'char(10)'),

    val1 = t1.n.value('./val1[1]', 'varchar(50)'),

    val2 = t1.n.value('./val2[1]', 'varchar(50)'),

    val3 = t1.n.value('./val3[1]', 'varchar(50)')

    FROM @xml1.nodes('./root/record') AS t1 (n);

  • Thanks Mark and Paul for the suggestions.

    The case I came up with is for a training session so I'm trying to illustrate a point. I understand how to shred if it has a named root node (I have other examples that cover this).

    I am happy to use this to illustrate how not to form XML. I was just afraid that at my experience level, I was missing something.

    Mark - I'll need to analyze your suggestion closely to figure out what you're doing. MS might as well have written the documentation in Klingon for all the good it does me.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • -- post deleted...

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Of course the xml should be improved on for any practical implementation, but still this challenged me. Since this is hypothetical already and on top of that Mark is in this discussion too, I thought I'd take this a little further. Mark recently introduced me to << in an xquery (or is it xpath?) expression. I have so far not been able to find what exactly it does :crazy:. I started experimenting with it for this challenge... I have now found how to identify the next ID element following each ID element in the document, but I don't know how to continue...

    Here's me hoping Mark can shed some more light on what exactly this <<-operator (and apparently the opposite >> works too?) does and maybe from my experimenting others get an idea how we can get a working solution for this challenge.

    What I've come up with finds for each ID-element the next ID element in the xml, i.e. all values for this ID must be in the elements in between these two elements. I don't know how to continue from here, but maybe someone else can help?

    DECLARE @t TABLE

    (ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50)

    ,val2 VARCHAR(50), val3 VARCHAR(50))

    INSERT INTO @t

    SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A'

    UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B'

    UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C'

    UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D'

    DECLARE @xml1 XML, @xml2 XML

    SELECT @xml1 = (SELECT * FROM @t FOR XML PATH(''))

    SELECT @xml1

    select n.n.value('.','int') as ID,

    n.n.query('for $b in . return $b/../*[. >> $b][local-name(.) = "ID"][1]')

    from @xml1.nodes('/ID') n(n)

    order by 1

    The results so far:

    ID (No column name)

    1 <ID>2</ID>

    2 <ID>3</ID>

    3 <ID>4</ID>

    4



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • R.P.Rozema (5/24/2012)


    Of course the xml should be improved on for any practical implementation, but still this challenged me. Since this is hypothetical already and on top of that Mark is in this discussion too, I thought I'd take this a little further. Mark recently introduced me to << in an xquery (or is it xpath?) expression. I have so far not been able to find what exactly it does :crazy:. I started experimenting with it for this challenge... I have now found how to identify the next ID element following each ID element in the document, but I don't know how to continue...

    Here's me hoping Mark can shed some more light on what exactly this <<-operator (and apparently the opposite >> works too?) does and maybe from my experimenting others get an idea how we can get a working solution for this challenge.

    What I've come up with finds for each ID-element the next ID element in the xml, i.e. all values for this ID must be in the elements in between these two elements. I don't know how to continue from here, but maybe someone else can help?

    DECLARE @t TABLE

    (ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50)

    ,val2 VARCHAR(50), val3 VARCHAR(50))

    INSERT INTO @t

    SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A'

    UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B'

    UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C'

    UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D'

    DECLARE @xml1 XML, @xml2 XML

    SELECT @xml1 = (SELECT * FROM @t FOR XML PATH(''))

    SELECT @xml1

    select n.n.value('.','int') as ID,

    n.n.query('for $b in . return $b/../*[. >> $b][local-name(.) = "ID"][1]')

    from @xml1.nodes('/ID') n(n)

    order by 1

    The results so far:

    ID (No column name)

    1 <ID>2</ID>

    2 <ID>3</ID>

    3 <ID>4</ID>

    4

    Have a look at "Node Order Comparison Operators" here, it's a bit sketchy though.

    http://msdn.microsoft.com/en-us/library/ms190935.aspx

    I've updated the query I originally posted to handle the case of missing elements - it uses a difference of

    ROW_NUMBERs rather than the ">>" operator which is, as you pointed out, much faster.

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Mark-101232 (5/24/2012)


    Try this, I can't see a simpler way with the XML as is.

    WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID,

    MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1,

    MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1,

    MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2,

    MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3

    FROM CTE

    GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

    Edit : Fixed to handle case when group does not have all elements present.

    This is a very interesting solution and I'm only beginning to understand how it works. In my effort to do so, I found another alternative based on it that also works.

    ;WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT ID=MAX(CAST(ID AS INT)), key1=MAX(CAST(key1 AS CHAR(10)))

    , val1=MAX(CAST(val1 AS VARCHAR(50))), val2=MAX(CAST(val2 AS VARCHAR(50)))

    , val3=MAX(CAST(val3 AS VARCHAR(50)))

    FROM CTE

    PIVOT (MAX(val) FOR name IN ([ID], [key1], [val1], [val2], [val3])) pt

    GROUP BY CASE WHEN ID IS NOT NULL THEN rn2 ELSE rn1-rn2 END

    I will continue to research until the light bulb comes on.:hehe:

    Thanks again!

    Edit: Corrected to include CASTing of results.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Mark-101232 (5/24/2012)


    R.P.Rozema (5/24/2012)


    Of course the xml should be improved on for any practical implementation, but still this challenged me. Since this is hypothetical already and on top of that Mark is in this discussion too, I thought I'd take this a little further. Mark recently introduced me to << in an xquery (or is it xpath?) expression. I have so far not been able to find what exactly it does :crazy:. I started experimenting with it for this challenge... I have now found how to identify the next ID element following each ID element in the document, but I don't know how to continue...

    Here's me hoping Mark can shed some more light on what exactly this <<-operator (and apparently the opposite >> works too?) does and maybe from my experimenting others get an idea how we can get a working solution for this challenge.

    What I've come up with finds for each ID-element the next ID element in the xml, i.e. all values for this ID must be in the elements in between these two elements. I don't know how to continue from here, but maybe someone else can help?

    DECLARE @t TABLE

    (ID INT IDENTITY, key1 CHAR(10), val1 VARCHAR(50)

    ,val2 VARCHAR(50), val3 VARCHAR(50))

    INSERT INTO @t

    SELECT 'REC01', 'VALUE 01A', 'VALUE 02A', 'VALUE 03A'

    UNION ALL SELECT 'REC02', 'VALUE 01B', 'VALUE 02B', 'VALUE 03B'

    UNION ALL SELECT 'REC03', 'VALUE 01C', 'VALUE 02C', 'VALUE 03C'

    UNION ALL SELECT 'REC04', 'VALUE 01D', 'VALUE 02D', 'VALUE 03D'

    DECLARE @xml1 XML, @xml2 XML

    SELECT @xml1 = (SELECT * FROM @t FOR XML PATH(''))

    SELECT @xml1

    select n.n.value('.','int') as ID,

    n.n.query('for $b in . return $b/../*[. >> $b][local-name(.) = "ID"][1]')

    from @xml1.nodes('/ID') n(n)

    order by 1

    The results so far:

    ID (No column name)

    1 <ID>2</ID>

    2 <ID>3</ID>

    3 <ID>4</ID>

    4

    Have a look at "Node Order Comparison Operators" here, it's a bit sketchy though.

    http://msdn.microsoft.com/en-us/library/ms190935.aspx

    I've updated the query I originally posted to handle the case of missing elements - it uses a difference of

    ROW_NUMBERs rather than the ">>" operator which is, as you pointed out, much faster.

    This suggestion gives me an interesting idea which I am clueless how to act on (haven't read the link yet but I will).

    Would it be possible to store each element group in a row (with ID in a column as in the above result set), with a second XML column that contains all of the XML for that element group?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (5/24/2012)


    This suggestion gives me an interesting idea which I am clueless how to act on (haven't read the link yet but I will).

    Would it be possible to store each element group in a row (with ID in a column as in the above result set), with a second XML column that contains all of the XML for that element group?

    Yes indeed, this can be done.

    SELECT row.value('ID[1]', 'INT') as ID

    ,row.value('key1[1]', 'CHAR(10)') as key1

    ,row.value('val1[1]', 'VARCHAR(50)') as var1

    ,row.value('val2[1]', 'VARCHAR(50)') as var2

    ,row.value('val3[1]', 'VARCHAR(50)') as var3

    FROM (

    SELECT CAST('<ID>' + strcol AS XML) AS xml1

    FROM (SELECT CAST(@xml1 AS VARCHAR(MAX))) AS x(MyXML)

    CROSS APPLY dbo.SplitString('<ID>', SUBSTRING(MyXML, 5, LEN(MyXML)))) x

    CROSS APPLY XML1.nodes('.') t(row)


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (5/24/2012)


    Mark-101232 (5/24/2012)


    Try this, I can't see a simpler way with the XML as is.

    WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID,

    MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1,

    MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1,

    MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2,

    MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3

    FROM CTE

    GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

    Edit : Fixed to handle case when group does not have all elements present.

    This is a very interesting solution and I'm only beginning to understand how it works. In my effort to do so, I found another alternative based on it that also works.

    ;WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT ID=MAX(CAST(ID AS INT)), key1=MAX(CAST(key1 AS CHAR(10)))

    , val1=MAX(CAST(val1 AS VARCHAR(50))), val2=MAX(CAST(val2 AS VARCHAR(50)))

    , val3=MAX(CAST(val3 AS VARCHAR(50)))

    FROM CTE

    PIVOT (MAX(val) FOR name IN ([ID], [key1], [val1], [val2], [val3])) pt

    GROUP BY CASE WHEN ID IS NOT NULL THEN rn2 ELSE rn1-rn2 END

    I will continue to research until the light bulb comes on.:hehe:

    Thanks again!

    Edit: Corrected to include CASTing of results.

    The difference is that Mark used a cross-tab, where you turned to a pivot. Cross tabs are most of the times faster than pivots because it lacks one operation applied to each value by pivot. Cross tab syntax is also considered better readable by most people, but the most important advantage is that cross tabs have a much broader field of problems they can be applied to than pivot. Jeff wrote a very good article comparing the two: http://qa.sqlservercentral.com/articles/T-SQL/63681/.



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • Yet another method that works but is ugly and probably not recommended. I like my previous one better.

    SELECT @xml1 = (SELECT * FROM @t FOR XML PATH(''))

    -- Add in row tags

    SELECT row.value('ID[1]', 'INT') as ID

    ,row.value('key1[1]', 'CHAR(10)') as key1

    ,row.value('val1[1]', 'VARCHAR(50)') as var1

    ,row.value('val2[1]', 'VARCHAR(50)') as var2

    ,row.value('val3[1]', 'VARCHAR(50)') as var3

    FROM (

    SELECT CAST(REPLACE(Tag1, '</val3>', '</val3></row>') AS XML) AS XML1

    FROM (

    SELECT Tag1 FROM (SELECT CAST(@xml1 AS VARCHAR(MAX))) z(xml1)

    CROSS APPLY (SELECT REPLACE(xml1, '<ID>', '<row><ID>')) y(Tag1)

    ) x ) z

    CROSS APPLY XML1.nodes('row') t(row)


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • R.P.Rozema (5/25/2012)


    The difference is that Mark used a cross-tab, where you turned to a pivot. Cross tabs are most of the times faster than pivots because it lacks one operation applied to each value by pivot. Cross tab syntax is also considered better readable by most people, but the most important advantage is that cross tabs have a much broader field of problems they can be applied to than pivot. Jeff wrote a very good article comparing the two: http://qa.sqlservercentral.com/articles/T-SQL/63681/.

    It's a matter of taste I suppose. Not sure either one is very readable and I've heard the same about performance of PIVOTs (hence I rarely use them). I've seen the link you provided but thanks nonetheless.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (5/24/2012)


    Mark-101232 (5/24/2012)


    Try this, I can't see a simpler way with the XML as is.

    WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT MAX(CASE WHEN name='ID' THEN CAST(val AS INT) END) AS ID,

    MAX(CASE WHEN name='key1' THEN CAST(val AS CHAR(10)) END) AS key1,

    MAX(CASE WHEN name='val1' THEN CAST(val AS VARCHAR(50)) END) AS val1,

    MAX(CASE WHEN name='val2' THEN CAST(val AS VARCHAR(50)) END) AS val2,

    MAX(CASE WHEN name='val3' THEN CAST(val AS VARCHAR(50)) END) AS val3

    FROM CTE

    GROUP BY CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END;

    Edit : Fixed to handle case when group does not have all elements present.

    This is a very interesting solution and I'm only beginning to understand how it works. In my effort to do so, I found another alternative based on it that also works.

    ;WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT ID=MAX(CAST(ID AS INT)), key1=MAX(CAST(key1 AS CHAR(10)))

    , val1=MAX(CAST(val1 AS VARCHAR(50))), val2=MAX(CAST(val2 AS VARCHAR(50)))

    , val3=MAX(CAST(val3 AS VARCHAR(50)))

    FROM CTE

    PIVOT (MAX(val) FOR name IN ([ID], [key1], [val1], [val2], [val3])) pt

    GROUP BY CASE WHEN ID IS NOT NULL THEN rn2 ELSE rn1-rn2 END

    I will continue to research until the light bulb comes on.:hehe:

    Thanks again!

    Edit: Corrected to include CASTing of results.

    It's a variation on the "Gaps and Islands" technique, see Jeff Moden's excellent article here

    http://qa.sqlservercentral.com/articles/T-SQL/71550/

    Have a looks at the results of the query below, it should make things clearer.

    WITH CTE AS (

    SELECT row.value('local-name(.)', 'VARCHAR(50)') AS name

    ,row.value('text()[1]', 'VARCHAR(50)') AS val

    ,ROW_NUMBER() OVER(ORDER BY t.row) AS rn1

    ,ROW_NUMBER() OVER(PARTITION BY CASE WHEN row.value('local-name(.)', 'VARCHAR(50)')='ID' THEN 1 ELSE 0 END ORDER BY t.row) AS rn2

    FROM (SELECT @xml1 AS XML1) x

    CROSS APPLY XML1.nodes('/*') t(row))

    SELECT name,

    val,

    rn1,

    rn2,

    CASE WHEN name='ID' THEN rn2 END AS IDrn,

    CASE WHEN name<>'ID' THEN rn1-rn2 END AS NotIDrn,

    CASE WHEN name='ID' THEN rn2 ELSE rn1-rn2 END AS GroupByThis

    FROM CTE

    ORDER BY rn1;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • You have no idea what a time-saver you just posted for me. I needed to search out that very same gaps and islands article for something else I'm working on. Now I have it.

    It also does help to explain the solution you provided. Thanks much!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply