Can you use Lag to avoid recursive CTEs?

  • Let's say I have a scalar functions that I'd like it's input to be the output from the previous row, with a recursive CTE I can do the following:

    ;with rCTE(iterations,computed) as (

    select 0 [iterations], 1e+0 [computed]

    union all

    select iterations+1,dbo.f(computed)

    from where rCTE

    where iterations < 30

    )

    select * from rCTE

    Thus for each iteration, is the nTh fold of the function f applied to 1. [e.g 5 is f(f(f(f(f(1)))))]

    However, for some illogical reason this relatively simple function did lots of read and write in tempdb. Can I reform this query somehow to just use lag instead? I know for a fact I only want to get let's say 30 iterations. It'd be very nice to be able to enjoy a window spool which will spawn a worktable with minimal IO.

    I know I can put 30 rows into a table variable and do a quirky update across it, but Is there a nice way to do this without doing some sort of hack.

  • If you need access to the intermediate computation from the previous row then LAG will not do it.

    If I recall well, there was a suggestion from Marcello Poletti, with the help of Erland Sommarskog, for a window function (Previous) that could help in these cases.

    Calcolo progressivo e funzione Previous in SQL Server

    I am not sure if it made it into Microsoft Connect.

  • First, I do believe you can do this with using LAG. I don't have access to a 2012 instance at the moment; otherwise I could post an example.

    That said, I was recently working on a T-SQL TRANSLATE function (like what exists in DB2 and Oracle). The way I did it requires the the query read the value from the previous row. Below are three examples of how I achieved this. The first is using and rCTE second technique uses the "Quirky Update" which safe and predictable provided that you follow the "Rules" outlined in this article [/url]by Jeff Moden. The third technique is using a Tally table. I hope this helps.

    method #1: using rCte

    DECLARE @string varchar(8000) = 'xxxyyyzzz',

    @pre varchar(100) = 'xyz',

    @post varchar(100) = 'abc',

    @newString varchar(8000);

    WITH x AS

    (

    SELECT 1 AS n,

    REPLACE(@string,substring(@pre,1,1),substring(@post,1,1)) AS s

    UNION ALL

    SELECT n+1,

    REPLACE(s,substring(@pre,n+1,1),substring(@post,n+1,1)) AS s

    FROM x

    WHERE n<LEN(@pre))

    SELECT TOP(1) s AS newString

    FROM x

    WHERE n=LEN(@pre);

    method #2: Quirky Update

    DECLARE @string varchar(8000) = 'xxxyyyzzz',

    @pre varchar(100) = 'xyz',

    @post varchar(100) = 'abc',

    @newString varchar(8000);

    DECLARE @temp TABLE(n int primary key, string varchar(8000));

    INSERT @temp

    SELECT TOP (len(@pre)) ROW_NUMBER() over (order by (select null)), ''

    FROM sys.all_columns;

    UPDATE @temp

    SET @newString = string = REPLACE(ISNULL(@newstring,@string),

    SUBSTRING(@pre,n,1),

    SUBSTRING(@post,n,1))

    SELECT @newString AS newString;

    method #3: using a tally table

    DECLARE @string varchar(8000) = 'xxxyyyzzz',

    @pre varchar(100) = 'xyz',

    @post varchar(100) = 'abc',

    @newString varchar(8000);

    ;WITH

    iTally AS

    (

    SELECT TOP(LEN(@pre)) ROW_NUMBER() OVER (ORDER BY (SELECT ($))) n

    FROM sys.all_columns

    )

    SELECT @newstring = REPLACE(ISNULL(@newstring,@string),

    SUBSTRING(@pre,n,1),

    SUBSTRING(@post,n,1))

    FROM iTally

    WHERE n<=LEN(@pre);

    SELECT @newstring AS newString;

    Cheers.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • As to noticing that rCTEs produce a lot of reads, it's because it's "just" a loop that's nearly identical to reading the first row and then looping through all of the others. Using rCTEs to count can be pretty bad. Please see the following article.

    http://qa.sqlservercentral.com/articles/T-SQL/74118/

    As for using LAG for previous row calculations, the answer is "Yes you can". Please see the following article for a classic example.

    http://blog.waynesheffield.com/wayne/archive/2011/08/running-totals-in-denali-ctp3/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Yes, I'm aware as I've already read that article and I rarely use rCTEs for that reason.

    While you can use the few blessed aggregate function that support an over clause, how can I perform arbitrary functions based on the result of the previous row, without the use of an rCTE or a quirky update?

    Certainly quirky update works:

    select top 30 identity(int) [iterations],0e [computed] into #temp from sys.all_columns

    declare @C float =1

    update #temp

    set @C = computed = dbo.f(@c)

    But it seems that there is no way to treat "lag" like a "previous" function. I can't say

    select top(30) n,computed=dbo.f(lag(computed),1,1) over (order by n))

    from (select row_number() over (order by (select null)) from sys.all_columns) z(n)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply