gaps and overlaps... here''s a good one.

  • Here's the table in question:

    PRICE_HISTORY

    prodid int,

    price float,

    timefrom datetime,

    timethru datetime

    The most recent record (ie, greatest TimeFrom) will have a null TimeThru, otherwise, every TimeThru corresponds to the next record's TimeFrom (for the same prodid)

    here's an example of 2 consecutive records:

    prodid, price, timefrom, timethru

    ------------

    4, 4.23, 1/1/2004 12:08:34.030, 1/4/2004 23:54:32.100

    4, 4.15, 1/4/2004 23:54:32.100, 1/5/2004 00:12:34.567

    4, 4.20, 1/5/2004 00:12:34.567, 1/12/2004 12:01:00.098

    That's how it SHOULD look. I'm suspicious that there are gaps in some of the records (where there is no price for a given prodid/time). I'm also suspicious that there are overlaps in price (where there is more than one price for a given prodid/time).

    I need to generate queries that will confirm or deny these 2 suspicions. Who wants a gold star?

  • You need to compare adjacent records for each ProdID, which requires some method to add a rank or sequence to the time series.

    In Yukon/2005, there are new SQL functions, including RANK() OVER which provide this. In earlier versions, you need to create a #temp table or @Table variable with an Identity column, and insert into this with an order by. Your data qulaity queries are then a series of self-joins on the temp table or variable, looking for mismatches between adjacent records.

    Declare @SortedPrice Table ( Rank int identity,

      prodid int,

      price float,

      timefrom datetime,

      timethru datetime )

    Insert into @SortedPrice (prodid, price, timefrom, timethru)

    Select prodid, price, timefrom, timethru

    from price_history

    Order By prodid, timefrom

     

    This query will locate records (in t1) where the end date does not match the start date in the adjacent record (in t2).

    Select t1.*, t2.*

    From @SortedPrice As t1

    Inner Join @SortedPrice As t2

      On ( t1.prodid = t2.prodid and

           t1.Rank = t2.Rank - 1 )

    Where t1.timethru <> t2.timefrom

     

    Other queries are essentially the same self Inner Join, just with different Where clauses to look for different data issues between adjacent records.

     

  • Danke Zehn PW. That helps a lot. Looking forward to Yukon even more now.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply