TSQL - Find maximum running date gap in a date range and also date gap ranking

  • Hi,

    I want to find maximum running date gap in a data set and also want to rank the gap days. Following is the sample data.

    USE TempDB

    GO

    SET LANGUAGE US_ENGLISH

    SET NOCOUNT ON

    IF OBJECT_ID('#temp_date_gap') IS NOT NULL

    DROP TABLE #temp_date_gap

    CREATE TABLE #temp_date_gap

    (

    date_and_time datetime primary key

    )

    INSERT INTO #temp_date_gap VALUES ('2012-05-01 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-05-03 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-05-06 11:30:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-06-01 01:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-06-03 00:00:01.000')

    INSERT INTO #temp_date_gap VALUES ('2012-06-20 14:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-08-01 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-08-03 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-08-06 11:30:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-09-01 01:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012-09-03 00:00:01.000')

    INSERT INTO #temp_date_gap VALUES ('2012-09-20 14:00:00.000')

    select *

    from #temp_date_gap

    1: Output 1 - Maximum Gap with dates

    Last_date_BEFORE_largest_gap First_date_AFTER_largest_gap

    20/06/2012 14:00 01/08/2012 09:00

    2: Output 2 - Gap ranking

    date_and_time_start date_and_time_end running_day_diff Rank

    20/06/2012 14:00 01/08/2012 09:00 42 1

    06/05/2012 11:30 01/06/2012 01:00 26 2

    06/08/2012 11:30 01/09/2012 01:00 26 2

    03/06/2012 00:00 20/06/2012 14:00 18 3

    03/09/2012 00:00 20/09/2012 14:00 18 3

    03/05/2012 09:00 06/05/2012 11:30 3 4

    03/08/2012 09:00 06/08/2012 11:30 3 4

    01/05/2012 09:00 03/05/2012 09:00 2 5

    01/08/2012 09:00 03/08/2012 09:00 2 5

    01/06/2012 01:00 03/06/2012 00:00 2 5

    01/09/2012 01:00 03/09/2012 00:00 2 5

    If possible then can we make this dynamic to find the Nth largest GAP?

    Thanks.

  • You can use this to give you output 2. Output 1 is where Rank is 1

    WITH CTE AS (

    SELECT date_and_time,

    ROW_NUMBER() OVER(ORDER BY date_and_time) AS rn

    FROM #temp_date_gap)

    SELECT t1.date_and_time AS date_and_time_start,

    t2.date_and_time AS date_and_time_end,

    DATEDIFF(Day,t1.date_and_time,t2.date_and_time) AS running_day_diff,

    DENSE_RANK() OVER(ORDER BY DATEDIFF(Day,t1.date_and_time,t2.date_and_time) DESC) AS Rank

    FROM CTE t1

    INNER JOIN CTE t2 ON t2.rn=t1.rn+1

    ORDER BY running_day_diff DESC,date_and_time_start;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Thanks Marks,

    1: Any other possibility without CTE?

    2: What if maximum date GAP required in each year? so that year will be a sort of group key and within each group what is the maximum date gap?

    So now the sample is:

    USE TempDB

    GO

    SET LANGUAGE US_ENGLISH

    SET NOCOUNT ON

    IF OBJECT_ID('#temp_date_gap') IS NOT NULL

    DROP TABLE #temp_date_gap

    CREATE TABLE #temp_date_gap

    (

    date_year varchar(4),

    date_and_time datetime primary key

    )

    INSERT INTO #temp_date_gap VALUES ('2011','2011-05-01 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-05-03 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-05-06 11:30:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-06-01 01:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-06-03 00:00:01.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-06-20 14:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-08-01 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-08-03 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-08-06 11:30:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-09-01 01:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-09-03 00:00:01.000')

    INSERT INTO #temp_date_gap VALUES ('2011','2011-09-20 14:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-05-01 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-05-03 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-05-06 11:30:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-06-01 01:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-06-03 00:00:01.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-06-20 14:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-08-01 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-08-03 09:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-08-06 11:30:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-09-01 01:00:00.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-09-03 00:00:01.000')

    INSERT INTO #temp_date_gap VALUES ('2012','2012-09-20 14:00:00.000')

    select *

    from #temp_date_gap

    Sorry for the requirement changes but its related and i want to keep it on one place.

    Cheers.

  • Per year as below. As for not using CTEs, you could change the CTE to a derived table, but that's pretty much just a cosmetic difference.

    WITH CTE AS (

    SELECT date_year,

    date_and_time,

    ROW_NUMBER() OVER(PARTITION BY date_year ORDER BY date_and_time) AS rn

    FROM #temp_date_gap)

    SELECT t1.date_year,

    t1.date_and_time AS date_and_time_start,

    t2.date_and_time AS date_and_time_end,

    DATEDIFF(Day,t1.date_and_time,t2.date_and_time) AS running_day_diff,

    DENSE_RANK() OVER(ORDER BY DATEDIFF(Day,t1.date_and_time,t2.date_and_time) DESC) AS Rank

    FROM CTE t1

    INNER JOIN CTE t2 ON t2.rn=t1.rn+1

    AND t2.date_year=t1.date_year

    ORDER BY running_day_diff DESC,date_and_time_start;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Great!

    For output 1, i still need a drived table OR in-line view to apply where on dense_rank?

    As you know, Dense_rank cannot be use in the same where clause.

    in my scenario, i want to use this logic in a DML. Can you suggest any simple way please.

    Thanks.

  • Lots of options. For your temporary table solution you can do this

    WITH CTE AS (

    SELECT date_year,

    date_and_time,

    ROW_NUMBER() OVER(PARTITION BY date_year ORDER BY date_and_time) AS rn

    FROM #temp_date_gap),

    Results AS (

    SELECT t1.date_year,

    t1.date_and_time AS date_and_time_start,

    t2.date_and_time AS date_and_time_end,

    DATEDIFF(Day,t1.date_and_time,t2.date_and_time) AS running_day_diff,

    DENSE_RANK() OVER(ORDER BY DATEDIFF(Day,t1.date_and_time,t2.date_and_time) DESC) AS Rank

    FROM CTE t1

    INNER JOIN CTE t2 ON t2.rn=t1.rn+1

    AND t2.date_year=t1.date_year)

    SELECT date_year,date_and_time_start,date_and_time_end,running_day_diff,Rank

    FROM Results

    WHERE Rank=1;

    If your data is in a permanent table you could create a view which you can more easily use in any other DML.

    CREATE VIEW dbo.myview

    AS

    WITH CTE AS (

    SELECT date_year,

    date_and_time,

    ROW_NUMBER() OVER(PARTITION BY date_year ORDER BY date_and_time) AS rn

    FROM dbo.mytable)

    SELECT t1.date_year,

    t1.date_and_time AS date_and_time_start,

    t2.date_and_time AS date_and_time_end,

    DATEDIFF(Day,t1.date_and_time,t2.date_and_time) AS running_day_diff,

    DENSE_RANK() OVER(ORDER BY DATEDIFF(Day,t1.date_and_time,t2.date_and_time) DESC) AS Rank

    FROM CTE t1

    INNER JOIN CTE t2 ON t2.rn=t1.rn+1

    AND t2.date_year=t1.date_year;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Thanks Mark,

    yes, as i said view or drived tabled are the options for that.

    The thing which i have just noticed was if there is only one date in a year or group, for example, then this logic will not working.

    As i want that if there is only that in a group then return same date with rank 1

    possible?

    Thanks.

  • Changing the INNER JOIN to a LEFT OUTER should work

    WITH CTE AS (

    SELECT date_year,

    date_and_time,

    ROW_NUMBER() OVER(PARTITION BY date_year ORDER BY date_and_time) AS rn

    FROM #temp_date_gap)

    SELECT t1.date_year,

    t1.date_and_time AS date_and_time_start,

    COALESCE(t2.date_and_time,t1.date_and_time) AS date_and_time_end,

    COALESCE(DATEDIFF(Day,t1.date_and_time,t2.date_and_time),0) AS running_day_diff,

    DENSE_RANK() OVER(ORDER BY COALESCE(DATEDIFF(Day,t1.date_and_time,t2.date_and_time),0) DESC) AS Rank

    FROM CTE t1

    LEFT OUTER JOIN CTE t2 ON t2.rn=t1.rn+1

    AND t2.date_year=t1.date_year

    ORDER BY running_day_diff DESC,date_and_time_start;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • MidBar,

    This version has a very slightly better query plan cost (49%/51%) than the CTE version and may not have the issue you report when 1 record.

    SELECT date_and_time_start

    ,date_and_time_end

    ,DATEDIFF(day,date_and_time_start, date_and_time_end) as running_day_diff

    ,DENSE_RANK() OVER(ORDER BY DATEDIFF(Day, date_and_time_start, date_and_time_end) DESC) AS Rank

    FROM (

    SELECT t1.date_and_time AS date_and_time_start

    ,( SELECT TOP 1 date_and_time

    FROM #temp_date_gap t2

    WHERE t1.date_and_time < t2.date_and_time

    ORDER BY date_and_time) AS date_and_time_end

    FROM #temp_date_gap t1

    ) x

    WHERE date_and_time_end IS NOT NULL

    ORDER BY DATEDIFF(day,date_and_time_start, date_and_time_end) DESC

    Give it a try and let us know.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • MidBar (2/27/2012)


    1: Any other possibility without CTE?

    I have to ask... why didn't you want this with a CTE?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply