How will you write a query for this scenario

  • Jeff Moden (8/9/2011)


    Yowch! 2 WEEKS for the original code? Heh... talk about your basic long running queries. That really does fall in the category of "What were they thinking"? And you got it down to 35 minutes on older hardware.

    The old system was written in some strange front end language (never did figure out what) and I think DB2 and it stepped through the records row by row, multiple times.

    The vendor who wrote the old system (whom my client wanted to try and replace) swore that we'd never get SQL Server to process the entire resultset (50 million records = 3 months data) in under a month.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/9/2011)


    Jeff Moden (8/9/2011)


    Yowch! 2 WEEKS for the original code? Heh... talk about your basic long running queries. That really does fall in the category of "What were they thinking"? And you got it down to 35 minutes on older hardware.

    The old system was written in some strange front end language (never did figure out what) and I think DB2 and it stepped through the records row by row, multiple times.

    The vendor who wrote the old system (whom my client wanted to try and replace) swore that we'd never get SQL Server to process the entire resultset (50 million records = 3 months data) in under a month.

    1 MONTH?!?!?

    I wish I had seen their face when they heard the final time to run the process :-D.

    Would have been even funnier if it had been under a minute :w00t:... just to really drive the point in!

  • Vermarajiv - here is a better WHILE w/o using expensive btree DELETE:

    SET NOCOUNT ON -- elim intmed display, note: @@rowcount is not affected

    CREATE TABLE #t (pid int, dt datetime)

    INSERT INTO #t values (1, '2/1/2011')

    INSERT INTO #t values (1, '2/10/2011')

    INSERT INTO #t values (1, '6/5/2011')

    INSERT INTO #t values (1, '6/30/2011')

    INSERT INTO #t values (1, '7/8/2011')

    INSERT INTO #t values (2, '3/1/2011')

    INSERT INTO #t values (2, '5/3/2011')

    DECLARE @depth int=0

    CREATE TABLE #r (pid int, dt datetime, depth int)-- Result

    CREATE UNIQUE INDEX idx_all on #r(depth,pid) INCLUDE(dt)

    INSERT #r (depth, pid, dt)

    SELECT @depth, t.pid, MIN(t.dt)

    FROM #t t

    GROUP BY t.pid

    WHILE (1>0 AND @depth<9999) -- prevent run away

    BEGIN -- Get the min date record

    INSERT #r (depth, pid, dt)

    SELECT @depth+1, t.pid, MIN(t.dt)

    FROM #r r

    INNER JOIN #t t ON r.pid=t.pid AND dateadd(dd,30,r.dt) < t.dt

    WHERE r.depth = @depth

    GROUP BY t.pid

    IF (@@ROWCOUNT < 1) BREAK -- all done

    SET @depth =@depth+1

    END

    SELECT * FROM #r ORDER BY pid,dt

    Beyond this, a CLR table-function might yields a better performance - freakishly fast RBAR;-) if memory/cpu and or tvp/batch taken into account...

Viewing 3 posts - 31 through 32 (of 32 total)

You must be logged in to reply to this topic. Login to reply