The recursive part of a CTE do not allow aggregate functions

  • There is a SQL Server undocumented extended stored procedure called xp_dirtree, which can return all files and folders name (include subfolders) in a table format. To practice my understanding of recursive CTE, I decide to use it to get the full path of all files in a specified folder(include subfolders). However, after an hour of head scratch I still can't figure out the correct way to do it. The main issue is that CTE recursive part do not allow aggregate functions. If it is allow, I'll write the following code. I want to know can this purpose be implemented with recursive CTE?

    DECLARE @dir NVARCHAR(260);

    SELECT @dir = N'c:\temp';

    IF RIGHT(@dir, 1) <> '\'

    SELECT @dir = @dir + '\';

    IF OBJECT_ID('tempdb..#dirtree', 'U') IS NOT NULL

    DROP TABLE #dirtree;

    CREATE TABLE #dirtree

    (

    id INT PRIMARY KEY

    IDENTITY,

    subdirectory NVARCHAR(260),

    depth INT,

    is_file BIT

    );

    INSERT INTO #dirtree

    EXEC xp_dirtree @dir, 0, 1;

    SELECT *

    FROM #dirtree;

    WITH cte

    AS (SELECT t.id,

    t.subdirectory,

    t.depth,

    t.is_file,

    CAST (t.subdirectory AS NVARCHAR(MAX)) AS path

    FROM #dirtree AS t

    WHERE is_file = 1

    UNION ALL

    SELECT d.id,

    d.subdirectory,

    d.depth,

    d.is_file,

    d.subdirectory + '\' + c.path

    FROM cte c

    JOIN #dirtree AS d ON c.depth = d.depth + 1

    AND d.is_file = 0

    AND d.id = (SELECT MAX(id)

    FROM #dirtree d2

    WHERE d2.depth = c.depth - 1

    )

    )

    SELECT @dir + cte.subdirectory

    FROM cte;

    Suppose what I have is:

    /*

    id subdirectory depth is_file

    --- -------------- ------- -------

    1 abc.mdf 1 1

    2 a 1 0

    3 a.txt 2 1

    4 b.txt 2 1

    5 a.rb 1 1

    6 aaa.flv 1 1

    */

    The result I want is :

    /*

    path

    ------------------

    c:\temp\abc.mdf

    c:\temp\a\a.txt

    c:\temp\a\b.txt

    c:\temp\a.rb

    c:\temp\aaa.flv

    */

  • Because it's not actually a hierarchy, a recursive CTE isn't the right way to do this. You'll just frustrate yourself with that approach to it.

    Recursive CTEs are really only good at parsing adjacency hierarchies from relational sets. What xp_dirtree returns isn't really in that form.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hello GSquared, if this can't be implemented by CTE. Do you have a way to do this?

    Thanks.

  • The problem is that xp_dirtree does not return data in a format that is going to help put this back together. It looks like you are trying to get a list of all files and folders within a parent folder? What happens when you have 2 folders? In your example you have only 1 subfolder of c:\temp. If you add a second the files all look the same and you have idea which folder is the parent.

    Consider the following results from a folder on my machine:

    insert #dirtree select 'APEdit.xsd', 1, 1

    insert #dirtree select 'DataSet.xsd', 1, 1

    insert #dirtree select 'Doc1.docx', 1, 1

    insert #dirtree select 'New folder', 1, 0

    insert #dirtree select 'New Text document.txt', 2, 1

    insert #dirtree select 'Nim2CodaTest', 1, 0

    insert #dirtree select 'G.G01182', 2, 1

    insert #dirtree select 'G.G01184', 2, 1

    insert #dirtree select 'G.G30430', 2, 1

    insert #dirtree select 'G.G30441', 2, 1

    insert #dirtree select 'G.S01182', 2, 1

    insert #dirtree select 'G.S01184', 2, 1

    insert #dirtree select 'G.S30430', 2, 1

    insert #dirtree select 'G.S30441', 2, 1

    insert #dirtree select 'OrderK81569.pdf', 1, 1

    insert #dirtree select 'Temp.mdb', 1, 1

    Which folder does "New Text document.txt" belong in? What about G.S30441? See my point? There is nothing in the data to tell you what the parent is.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I did find a post right on this site with some code that will list this exactly as I understand you are looking for.

    http://qa.sqlservercentral.com/Forums/Topic1104688-8-1.aspx

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (4/25/2012)


    The problem is that xp_dirtree does not return data in a format that is going to help put this back together. It looks like you are trying to get a list of all files and folders within a parent folder? What happens when you have 2 folders? In your example you have only 1 subfolder of c:\temp. If you add a second the files all look the same and you have idea which folder is the parent.

    Consider the following results from a folder on my machine:

    insert #dirtree select 'APEdit.xsd', 1, 1

    insert #dirtree select 'DataSet.xsd', 1, 1

    insert #dirtree select 'Doc1.docx', 1, 1

    insert #dirtree select 'New folder', 1, 0

    insert #dirtree select 'New Text document.txt', 2, 1

    insert #dirtree select 'Nim2CodaTest', 1, 0

    insert #dirtree select 'G.G01182', 2, 1

    insert #dirtree select 'G.G01184', 2, 1

    insert #dirtree select 'G.G30430', 2, 1

    insert #dirtree select 'G.G30441', 2, 1

    insert #dirtree select 'G.S01182', 2, 1

    insert #dirtree select 'G.S01184', 2, 1

    insert #dirtree select 'G.S30430', 2, 1

    insert #dirtree select 'G.S30441', 2, 1

    insert #dirtree select 'OrderK81569.pdf', 1, 1

    insert #dirtree select 'Temp.mdb', 1, 1

    Which folder does "New Text document.txt" belong in? What about G.S30441? See my point? There is nothing in the data to tell you what the parent is.

    You could make an assumption that 'New Text document.txt' belongs in 'New Folder' as it immediately follows it in the list visually. The real problem is that there is nothing in the data to help ensure order when processed by SQL Server. Any attempt to use ORDER BY in a select query will cause an issue.

  • yousui (4/25/2012)


    Hello GSquared, if this can't be implemented by CTE. Do you have a way to do this?

    Thanks.

    I'm accustomed to doing this via VB.NET CLR functions in SQL Server 2005 or later. VB.NET has very easy file system tools in it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Lynn Pettis (4/25/2012)


    Sean Lange (4/25/2012)


    The problem is that xp_dirtree does not return data in a format that is going to help put this back together. It looks like you are trying to get a list of all files and folders within a parent folder? What happens when you have 2 folders? In your example you have only 1 subfolder of c:\temp. If you add a second the files all look the same and you have idea which folder is the parent.

    Consider the following results from a folder on my machine:

    insert #dirtree select 'APEdit.xsd', 1, 1

    insert #dirtree select 'DataSet.xsd', 1, 1

    insert #dirtree select 'Doc1.docx', 1, 1

    insert #dirtree select 'New folder', 1, 0

    insert #dirtree select 'New Text document.txt', 2, 1

    insert #dirtree select 'Nim2CodaTest', 1, 0

    insert #dirtree select 'G.G01182', 2, 1

    insert #dirtree select 'G.G01184', 2, 1

    insert #dirtree select 'G.G30430', 2, 1

    insert #dirtree select 'G.G30441', 2, 1

    insert #dirtree select 'G.S01182', 2, 1

    insert #dirtree select 'G.S01184', 2, 1

    insert #dirtree select 'G.S30430', 2, 1

    insert #dirtree select 'G.S30441', 2, 1

    insert #dirtree select 'OrderK81569.pdf', 1, 1

    insert #dirtree select 'Temp.mdb', 1, 1

    Which folder does "New Text document.txt" belong in? What about G.S30441? See my point? There is nothing in the data to tell you what the parent is.

    You could make an assumption that 'New Text document.txt' belongs in 'New Folder' as it immediately follows it in the list visually. The real problem is that there is nothing in the data to help ensure order when processed by SQL Server. Any attempt to use ORDER BY in a select query will cause an issue.

    The real problem is like you said, you have to make an assumption. If you take a row isolated you have no clue where it belongs. And putting this through a cte is never going to work because there is nothing in the data to form the relationship. You could maybe brute force it by assuming that the row with the Max(ID) < the current row's ID and is_file = 0 is the parent but that is pretty "loosey goosey".

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply