Select By Date Groups

  • Hi I came across this question the other day and I didn't know the correct approach. Each product has consecutive start and end dates with the end date a day after. How do you retrieve the count of products when the start date is not the next day.


    CREATE TABLE products (id int, product_name varchar(25), start_date date,end_date date);

    INSERT INTO products (id,product,start_date,end_date)
    VALUES (1,product1, 20-04-2018, 21-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(2,product1, 22-04,2018,23-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(3,product1,24-04-2018,25-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES 
    (4
    ,product2, 20-04-2018, 21-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(5,product2, 22-04,2018,23-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(6,product2,24-04-2018,25-04-2018),

    INSERT INTO products (id,product,start_date,end_date)
     VALUES 
    (7,product3, 20-04-2018, 21-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(8,product3, 22-04,2018,23-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(9,product3,24-04-2018,25-04-2018),

    INSERT INTO products (id,product,start_date,end_date)
     VALUES 
    (10,product1, 26-04-2018, 27-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(11,product1, 28-04,2018,29-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES 
    (12,product2, 27-04-2018, 28-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(13,product2, 29-04,2018,30-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES(14,product3,29-04-2018,30-04-2018),
    INSERT INTO products (id,product,start_date,end_date)
     VALUES (
    15,product1, 20-04-2017, 21-04-2017),

    Expected result is 
     Product1 , 3
    Product2, 2
    Product3, 2

    You can see that each product has date range with consecutive days that is what should the count be based on. e.g product1 20-04-2018 to 25-04-2018 then 26-04-2018 to 29-04-2018 then 20-04-2017 to 21-04-2017 give a count of 3 🙂

    Thanks

  • Can you fix the sample data set and post the expected results please?
    😎

  • I'm most interested in WHY product1 has a count of 3.   Is it simply the maximum number of consecutive days that said product appears in the data?  If I merely COUNT the number of rows for product1, it's a lot more than 3.

  • sgmunson - Wednesday, May 23, 2018 1:14 PM

    I'm most interested in WHY product1 has a count of 3.   Is it simply the maximum number of consecutive days that said product appears in the data?  If I merely COUNT the number of rows for product1, it's a lot more than 3.

    I think I figured out how the given results were determined:


    IF OBJECT_ID('[dbo].[products]','U') IS NOT NULL
      DROP TABLE [dbo].[products];

    CREATE TABLE [dbo].[products] (
      [id] int
      , [product_name] varchar(25)
      , [start_date] date
      , [end_date] date
    );


    INSERT INTO [dbo].[products] (id,product_name,start_date,end_date)
    VALUES ( 1, 'product1', '2018-04-20', '2018-04-21'),
           ( 2, 'product1', '2018-04-22', '2018-04-23'),
           ( 3, 'product1', '2018-04-24', '2018-04-25'),
           ( 4, 'product2', '2018-04-20', '2018-04-21'),
           ( 5, 'product2', '2018-04-22', '2018-04-23'),
           ( 6, 'product2', '2018-04-24', '2018-04-25'),
           ( 7, 'product3', '2018-04-20', '2018-04-21'),
           ( 8, 'product3', '2018-04-22', '2018-04-23'),
           ( 9, 'product3', '2018-04-24', '2018-04-25'),
           (10, 'product1', '2018-04-26', '2018-04-27'),
           (11, 'product1', '2018-04-28', '2018-04-29'),
           (12, 'product2', '2018-04-27', '2018-04-28'),
           (13, 'product2', '2018-04-29', '2018-04-30'),
           (14, 'product3', '2018-04-29', '2018-04-30'),
           (15, 'product1', '2017-04-20', '2017-04-21');
    GO

    WITH base AS (
    SELECT
      [p].[product_name]
      , [grp] = [p].[id] - ROW_NUMBER() OVER (PARTITION BY [p].[product_name] ORDER BY [p].[id], [p].[start_date])
    FROM [dbo].[products] AS [p]
    )
    SELECT
      .[product_name]
      , [Cnt] = COUNT(DISTINCT .[grp])
    FROM
      [base] AS
    GROUP BY
      .[product_name]
    ORDER BY
      .[product_name];

    GO

    IF OBJECT_ID('[dbo].[products]','U') IS NOT NULL
      DROP TABLE [dbo].[products];

    GO

  • Thanks for the reply. 


    Why are you minus id from row number?
    [p].[id] - ROW_NUMBER() OVER (PARTITION BY [p].[product_name] ORDER BY [p].[id], [p].[start_date])

  • ringovski - Wednesday, May 23, 2018 5:45 PM

    Thanks for the reply. 


    Why are you minus id from row number?
    [p].[id] - ROW_NUMBER() OVER (PARTITION BY [p].[product_name] ORDER BY [p].[id], [p].[start_date])

    Based on the data, that was how I could group the data together and generate the expected results.  Best whay to see what is going on is to start pulling the query apart and see the pieces used to get to the final solution.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply