Seeking alternate method to CTE

  • Hello All,

    I am seeking for an alternate method to Common table Expression. In the nutshell, I like to drill down a tree structure to get all IDs of data nodes who data have been modified in the last x days. I used CTE for this, it works however for the size of database we have it takes few minutes to completed, which is not acceptable for us, I want to see if this can be done in seconds instead. Here are some facts about our database, this is just simplified DDL our actual tables use UNIQUEIDENTIFIER as datatype instead of INT and each table have alot more fields than in this example. The MyUpdateTracking contains around 29million records, MyTree, MyData ~ 11millon records each.

    Thank you in advance for anyone dare to help me on this.

    Regards

    CREATE TABLE MyOwner(

    OwnerID int,

    Name varchar(50)

    )

    GO

    insert into MyOwner(OwnerID,Name)

    select 1,'Owner1' union all

    select 2,'Owner2' union all

    select 3,'Owner3' union all

    select 4,'Owner4'

    CREATE TABLE MyData(

    DataID int ,

    DataName varchar(50),

    OwnerID int

    )

    GO

    insert into MyData(DataID,DataName,OwnerID)

    select 1,'Data1',1 union all

    select 2,'Data2',2 union all

    select 3,'Data3',3 union all

    select 4,'Data4',4 union all

    select 5,'Data5',4 union all

    select 6,'Data6',4 union all

    select 7,'Data7',4

    CREATE TABLE MyTree(

    ID int identity(1,1) ,

    ParentDataID int,

    ChildDataID int

    )

    GO

    insert into MyTree(ParentDataID,ChildDataID)

    select 1,2 union all

    select 2,3 union all

    select 2,5 union all

    select 2,6 union all

    select 3,4 union all

    select 3,7

    CREATE TABLE MyUpdateTracking(

    DataID int,

    UpdateID int

    )

    GO

    insert into MyUpdateTracking(DataID,UpdateID)

    select 1,1 union all

    select 2,2 union all

    select 3,3 union all

    select 4,3 union all

    select 5,3 union all

    select 6,3 union all

    select 7,3

    CREATE TABLE MyUpdateEvent(

    UpdateID int,

    TimeStart datetime

    )

    GO

    insert into MyUpdateEvent(UpdateID,UpdatedDate)

    select 1,GETDATE() union all

    select 2,GETDATE() union all

    select 3,GETDATE()

    DECLARE @NodeID int

    set @NodeID = 1;

    WITH TempCTE AS (

    SELECT ParentDataID, ChildDataID, 0 as Depth

    FROM MyTree

    WHERE ParentDataID=@NodeID

    UNION ALL

    SELECT R.ParentDataID, R.ChildDataID, Depth + 1

    FROM MyTree R

    INNER JOIN TempCTE CTE on CTE.ChildDataID = R.ParentDataID

    )

    select CTE.ChildDataID,CTE.Depth, D.DataName,e.UpdatedDate

    from TempCTE CTE

    inner join MyData D on D.DataID=CTE.ChildDataID

    inner join MyOwner O on O.OwnerID=D.OwnerID

    inner join MyUpdateTracking T on T.DataID=CTE.ChildDataID

    inner join MyUpdateEvent E on e.UpdateID=T.UpdateID

    where DATEDIFF(DAY, E.UpdatedDate , GETDATE()) <= 3;

  • Trees in sql require a little different modeling and query strategies.

    a quick google search on "Trees in SQL" reveals tons of articles.

    Joe Celko's book on the exact subject discusses this.

    here's a good read also.

    www.codeproject.com/Articles/8355/Trees-in-SQL-databases

  • What are the indexes on thoee tables, specifically mytree? Try running the results of the rCTE into a #temp table and joining the other tables to it.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (10/4/2012)


    What are the indexes on thoee tables, specifically mytree? Try running the results of the rCTE into a #temp table and joining the other tables to it.

    Chris,

    There are both clusteredd & non clustered indexes on both ParentDataID, ChildDataID, UpdatedDate.

    I will try your suggestion

    Updated: I modified to use temp table but the performace is not improved significantly

    Thanks!

  • Ray M (10/4/2012)


    Trees in sql require a little different modeling and query strategies.

    a quick google search on "Trees in SQL" reveals tons of articles.

    Joe Celko's book on the exact subject discusses this.

    here's a good read also.

    www.codeproject.com/Articles/8355/Trees-in-SQL-databases

    Thanks, I will look at the article.

    Updated: I read the article, however having triggers behind these tables is probably not a good idea for my case as that will add extra overhead to our app as it is already heavy. we have schedule job that runs periodically to import data to the database

  • Without a copy of the actual execution plan, it's impossible to tell for sure but I do see one potential problem that guarantees and index or table scan because of the way it's written. Here's that code snippet...

    where DATEDIFF(DAY, E.UpdatedDate , GETDATE()) <= 3;

    Like I said, that code guarantees an index scan of at least the table the "E" alias is pointing at. The code should be rewritten as follows to allow for an index seek (if you have a proper index for it) which should beat the tar out of the scan.

    WHERE E.UpdatedDate >= DATEADD(dd,-3,GETDATE())

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (10/4/2012)


    Without a copy of the actual execution plan, it's impossible to tell for sure but I do see one potential problem that guarantees and index or table scan because of the way it's written. Here's that code snippet...

    where DATEDIFF(DAY, E.UpdatedDate , GETDATE()) <= 3;

    Like I said, that code guarantees an index scan of at least the table the "E" alias is pointing at. The code should be rewritten as follows to allow for an index seek (if you have a proper index for it) which should beat the tar out of the scan.

    WHERE E.UpdatedDate >= DATEADD(dd,-3,GETDATE())

    Still no improvement at all.

  • haiao2000 (10/4/2012)


    Jeff Moden (10/4/2012)


    Without a copy of the actual execution plan, it's impossible to tell for sure but I do see one potential problem that guarantees and index or table scan because of the way it's written. Here's that code snippet...

    where DATEDIFF(DAY, E.UpdatedDate , GETDATE()) <= 3;

    Like I said, that code guarantees an index scan of at least the table the "E" alias is pointing at. The code should be rewritten as follows to allow for an index seek (if you have a proper index for it) which should beat the tar out of the scan.

    WHERE E.UpdatedDate >= DATEADD(dd,-3,GETDATE())

    Maybe you could add the execution plan!

    Still no improvement at all.

  • Ignacio A. Salom Rangel (10/4/2012)


    haiao2000 (10/4/2012)


    Jeff Moden (10/4/2012)


    Without a copy of the actual execution plan, it's impossible to tell for sure but I do see one potential problem that guarantees and index or table scan because of the way it's written. Here's that code snippet...

    where DATEDIFF(DAY, E.UpdatedDate , GETDATE()) <= 3;

    Like I said, that code guarantees an index scan of at least the table the "E" alias is pointing at. The code should be rewritten as follows to allow for an index seek (if you have a proper index for it) which should beat the tar out of the scan.

    WHERE E.UpdatedDate >= DATEADD(dd,-3,GETDATE())

    Maybe you could add the execution plan!

    Still no improvement at all.

    See attachment. notice it is ran against our actual db, so it will not look the same as posted DDL but the concept is the same.

    Updated. no answer...attachment deleted. 🙂

    Thanks!

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply