Tough Query

  • I'm trying to come up with a way without using cursors to do the following.  Here's the situation.  I  have a table that has a date and some othe data which is not important.  I'm trying to write a query to detemerine if there are more than X consective days for a given date range grouped by a date range.  To make things a little clearer becuase I'm not sure if a understand that either, here's an example.  Say we have a date range that is a month.  Now for each week (every 7 days) in this month does the week have more than 3 dates that are consective (Tuesday, Wedensday, Thursday). 

    You could also think of it as how many times has someone order something for 3 consective days in a week for a month.

    The main problem is trying to find the consecutive dates since teh consecutive dates can float.  I'm trying to find something that well perform quickly as the data grows.  I've thought about adding some helper tables that would get updated for inserts and deletes to help keep count but I was wondering if there were any other ideas.

    Thanks

    Eric

  • Hi Eric,

    Had a quick play with this, I dont think it's something you are going to accomplish with 'straight' sql but I will happily await a couple of the Guru's proving me wrong.....

    This needs tidying but its one way of accomplishing the date grouping problem:

    DECLARE @DateStart as datetime,

     @DateEnd as datetime,

     @MinPK as int,

     @MaxPK as int,

     @BaseDate as datetime,

     @Date2 as datetime,

     @Date3 as datetime

    --Give yourself a date range

    SET @DateStart = CAST('04/01/2004' as datetime)

    SET @DateEnd = CAST('04/30/2004' as datetime)

    --Copy the Key Value and the date values you want to evalute into a hash table

    --This will need doing for each customer

    select PK, OrderDate INTO #TEMPOrders from DateTest

    WHERE OrderDate BETWEEN @DateStart AND @DateEnd

    --Get a minimum and maximum key value range

    SET @MinPK = (SELECT MIN(PK) FROM #TEMPOrders WHERE OrderDate >= @DateStart AND OrderDate <= @DateEnd)

    SET @MaxPK = (SELECT MAX(PK) FROM #TEMPOrders WHERE OrderDate >= @DateStart AND OrderDate <= @DateEnd)

    --Cursorless loop, from the minimum to the maximum

    WHILE @MinPK <= @MaxPK

    BEGIN

     --Set the date variables

     SET @BaseDate = (SELECT OrderDate FROM #TEMPOrders WHERE PK = @MinPK)

     SET @Date2 = (SELECT top 1 OrderDate FROM #TEMPOrders WHERE OrderDate > @BaseDate)

     SET @Date3 = (SELECT top 1 OrderDate FROM #TEMPOrders WHERE OrderDate > @Date2)

     --See if Date 1 and date 2 have a difference of 1 AND date 2 and date 3 have a difference of 1

     --IF they have we have 3 consecutive days

     IF DATEDIFF(dd, @BaseDate, @Date2) = 1 AND DATEDIFF(dd, @Date2, @Date3) = 1

     BEGIN

      --Do whatever you do when you have 3 consecutive days

    PRINT 'Three consecutive dates - ' + CAST(@BaseDate as varchar(20)) + ' to ' + CAST(@Date3 as varchar(20))

      --Increment to the key value for the 3rd date so we dont hit a consecutive 3 day patch

      --eg - 3,4,5 and 6 would hit twice 3 to 5 and 4 to 6

      --May not be necessary, depends on your needs.

      SET @MinPK = (SELECT MIN(PK) FROM #TEMPOrders WHERE PK > @MinPK AND OrderDate = @Date2)

     END

     

     --Increment the end point

     SET @MinPK = (SELECT MIN(PK) FROM #TEMPOrders WHERE PK > @MinPK)

    END

    --Dump the temp table

    DROP TABLE #TEMPOrders

    I just created a table called DateTest with PK column and an OrderDate column to test and play with, modify to suit your table and your needs..... 

    As I said, you will need to loop it for each customer but i'm out of time now matey.

    Have fun

     

    Steve

    We need men who can dream of things that never were.

  • Assuming that you have a temp table with consecutive ids this may get you started :

    Create table dtes( dte datetime , did int)

    insert into dtes(dte, did) values('3/1/2005',  1)

    insert into dtes(dte, did) values('3/2/2005',  2)

    insert into dtes(dte, did) values('3/3/2005',  3)

    insert into dtes(dte, did) values('3/4/2005',  4)

    insert into dtes(dte, did) values('3/5/2005',  5)

    insert into dtes(dte, did) values('3/10/2005', 6)

    insert into dtes(dte, did) values('3/11/2005', 7)

    insert into dtes(dte, did) values('3/18/2005', 8)

    insert into dtes(dte, did) values('3/22/2005', 9)

    select d.dte, dd.Cnt

    from

    dtes d join

     (select dte - did  diff,  count (*) Cnt

     from dtes

     where dte between '3/1/2005' and '3/22/2005'

     group by dte - did ) dd

    on dd.dte - d.did  = dd.diff and dd.Cnt > 1

    HTH

     


    * Noel

  • Hi Noel,

    New someone would do it.....

    But - does it accomplish the original request?

    Correct me if i'm wrong - As far as i can see Eric is after a count of the number of three day groups in between a given date range to evaluate something, not all days that are in a group of three or more returning.

    Nice one Noel

    Speak to you soon

    Have fun

    Steve

    We need men who can dream of things that never were.

  • Steve,

    I just posted a sample in which you get the dates and how many were consecutives. This is not a final query becuse the actual output was really not specified. I you just output counts do you really don't need to know when ?

    Anyways this example can be modified by the poster at will

     

    HTH

     

     

     


    * Noel

  • Hi Noel,

    I know what you mean - the job spec isn't very clear.  Every time I read it I come up with a slightly different set of requirements. I'm sure there is something in there about picking how many sets of 3 consecutive dates, in however many weeks there are between the date range, then in a month (or between a date range).

    Anyway - There should be enough there to go on from.

    All the best mate.

    Have fun

    Steve

    We need men who can dream of things that never were.

  • Not sure if this of any use, but it should show the first of 3 consecutive dates that are Mon-Fri. This is written off the top of my head with no testing whatsoever

    SELECT a.[date]

    FROM

    a

    INNER JOIN

    b ON b.[date] = a.[date]+1

    INNER JOIN

    c ON c.[date] = a.[date]+2

    WHERE a.[date] BETWEEN @startdate AND @enddate

    AND DATEPART(dw,a.[date]) >= 1

    AND DATEPART(dw,c.[date]) <= 5

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Hi David,

    Works a treat, no problems - change the 5 to a 7 and it picks the ones running into weekends.

    Have fun

     

    Steve

    We need men who can dream of things that never were.

  • Thanks for all the great ideas.  I should be able to get something working. 

    Thanks again

    Eric

  • You could try this (of course, some limitations apply):

     

    Create table dates( date datetime)

    insert into dates values('2005-01-01')

    insert into dates values('2005-01-03')

    insert into dates values('2005-01-04')

    insert into dates values('2005-01-07')

    insert into dates values('2005-01-08')

    insert into dates values('2005-01-09')

    insert into dates values('2005-01-12')

    insert into dates values('2005-01-13')

    insert into dates values('2005-01-14')

    insert into dates values('2005-01-15')

    insert into dates values('2005-01-20')

    insert into dates values('2005-01-21')

    insert into dates values('2005-01-22')

    insert into dates values('2005-01-23')

    insert into dates values('2005-01-24')

    -- select * from dates

     

    DECLARE @StartDate datetime, @EndDate datetime, @DayPeriod int, @ConsecDays int, @mask varchar(20)

    SELECT @StartDate = '2005-01-01', @EndDate = '2005-01-31', @DayPeriod = 10, @ConsecDays = 3

    SELECT @mask = LTRIM( RTRIM(REPLACE(POWER(10, @ConsecDays) - 1, '9', '1')))

    SELECT PeriodNumber, MIN(date) as First_time_in_Period, MAX(date) as Last_time_in_Period,

    COUNT(*) as Times_in_Period, CAST (SUM(days) as varchar ) Days_map, @mask as mask,

    CHARINDEX (@mask, CAST (SUM(days) as varchar)) as First_day_in_Days_map

    FROM (SELECT date, DATEDIFF(day, @StartDate, date) / @DayPeriod as PeriodNumber,

     POWER(10, DATEDIFF(day, @StartDate, date) % @DayPeriod) as days

     FROM dates

     WHERE date between @StartDate and @EndDate) as p

    GROUP BY PeriodNumber

    HAVING COUNT(*) >= @ConsecDays

    AND CHARINDEX (@mask, CAST (SUM(days) as varchar)) > 0

     

     

    Igor

  • It seems that there is an easy solution.

    The idea is to use induction as follows: if I know when two dates are contiguos then I should apply that query to find out if three dates a contiguous and so on. For example, if the question is to find out all three consecutive dates then assuming my table looks like this:create table dbo.testdt(rid int identity(1,1), xdt datetime). Then, the following query returns all sets of three neighboring rows. By the way, the neighboorhood in this particular case is a date field but this could also be generalized to money, or even categorical data.

    The query for the date case is:

    select A.Arid,A.FirstDt, A.Brid, A.SecondDt, C.rid as Crid, C.xdt as ThirdDt

    from (select a.rid as Arid,a.xdt as FirstDt, b.rid as Brid, b.xdt as SecondDt

    from testdt a,testdt b

    where datediff(day,a.xdt,b.xdt)=1

    ) A, TestDt C

    where datediff(day, a.seconddt,c.xdt)=1

    If the number of consecutive is also variable, then the query can be generated mechanically via a stored procedure.

     

     

     

     

  • I am using this example as previously posted

    Create table dates( date datetime)

    insert into dates values('2005-01-01')

    insert into dates values('2005-01-03')

    insert into dates values('2005-01-04')

    insert into dates values('2005-01-07')

    insert into dates values('2005-01-08')

    insert into dates values('2005-01-09')

    insert into dates values('2005-01-12')

    insert into dates values('2005-01-13')

    insert into dates values('2005-01-14')

    insert into dates values('2005-01-15')

    insert into dates values('2005-01-20')

    insert into dates values('2005-01-21')

    insert into dates values('2005-01-22')

    insert into dates values('2005-01-23')

    insert into dates values('2005-01-24')

    This will tell me if that week of the year has 3 or more days consecutive in it.

    select

    datepart(wk, [date]) wk,

    (CASE

     WHEN (sum((power(2,datepart(dw, [date]))/2)) & 7) = 7 THEN 1

     WHEN (sum((power(2,datepart(dw, [date]))/2)) & 14) = 14 THEN 1

     WHEN (sum((power(2,datepart(dw, [date]))/2)) & 28) = 28 THEN 1

     WHEN (sum((power(2,datepart(dw, [date]))/2)) & 56) = 56 THEN 1

     WHEN (sum((power(2,datepart(dw, [date]))/2)) & 112) = 112 THEN 1

     ELSE 0 END) xx

    from

    dates

    group by

    datepart(wk, [date])

    However from what you said are you looking for day 1 thru day 7 to count as week one if you are lookign at say 1/1/2005 - 1/31/2005

    Or are you wanting to count 1/1/2005 the begining of the first week?

    If based on the range you want the start date to be the first date then try something like this

    select

    datepart(wk, dateadd(d,-(datepart(dw,@startdate) - 1),[date])) wk,

    (CASE

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,@startdate) - 1),[date])))/2)) & 7) = 7 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,@startdate) - 1),[date])))/2)) & 14) = 14 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,@startdate) - 1),[date])))/2)) & 28) = 28 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,@startdate) - 1),[date])))/2)) & 56) = 56 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,@startdate) - 1),[date])))/2)) & 112) = 112 THEN 1

     ELSE 0 END) xx

    from

    dates

    group by

    datepart(wk, dateadd(d,-(datepart(dw,@startdate) - 1),[date]))

    If instead it is Jan 1 2005 you want to be the start of the first week and a week being 7 so Jan 1 - Jan 7 being week 1, Jan 8 - being 2 and so on then try.

    select

    datepart(wk, dateadd(d,-(datepart(dw,'20050101') - 1),[date])) wk,

    (CASE

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,'20050101') - 1),[date])))/2)) & 7) = 7 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,'20050101') - 1),[date])))/2)) & 14) = 14 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,'20050101') - 1),[date])))/2)) & 28) = 28 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,'20050101') - 1),[date])))/2)) & 56) = 56 THEN 1

     WHEN (sum((power(2,datepart(wk, dateadd(d,-(datepart(dw,'20050101') - 1),[date])))/2)) & 112) = 112 THEN 1

     ELSE 0 END) xx

    from

    dates

    group by

    datepart(wk, dateadd(d,-(datepart(dw,'20050101') - 1),[date]))

    Then it is a matter of dealing with shifting choices of days or more consecutive to match.

    In the above I do the following with a bitmask to get 3 or more

    1 = Day 1 Or Sunday

    2 = Day 2 Or Monday

    4 = Day 3 Or Tuesday

    8 = Day 4 Or Wednesday

    16 = Day 5 Or Thursday

    32 = Day 6 Or Friday

    64 = Day 7 Or Saturday

    So to see if at least Sunday, Monday and Tuesday were in the set I check for the total bitmask of 1 + 2 + 4 or 7 in the generated mask.

    Now there may be a bit more you need but this should be able to be altered for your specific needs.

    Just another option.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply