Getting first and last value for every group.

  • I have the some data in one of a temporary table inside a stored procedure. I want to take the first and last value for every group and the number of records for every group. For example if you run the following script you will get a set of result.

    create table #sampledata

    (

    userId int,

    BaseYear int,

    TotalSales float

    )

    insert into #sampledata

    select 1,2008,25000 union

    select 1,2009,22500 union

    select 1,2010,23400 union

    select 1,2011,26700 union

    select 1,2012,25200 union

    select 2,2008,25050 union

    select 2,2009,22510 union

    select 2,2010,23470 union

    select 2,2011,26200 union

    select 3,2008,53504 union

    select 3,2008,43504 union

    select 3,2008,25504 union

    select 3,2008,23404 union

    select 3,2008,25504 union

    select 3,2008,23804

    select * from #sampledata

    drop table #sampledata

    From that set of result I want to get the data in the following format:

    userId StartValue EndValue Records

    1 25000 25200 5

    2 25050 26200 4

    3 53504 23804 6

    I am just not able to write the correct query. Can anyone suggest a solution?

  • Your results indicate that you want first and last by insert order. the table is a heap and therefore has no order.

    You will need to add an IDENTITY column to add order to the sequence of inserts.

    *Edited* To strike out the drivel as pointed out by Sean below 🙂

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Excellent job posting ddl and sample data. The challenge here is that you have to define what "first" and "last" means. A table by definition is an unordered set of data. In your case you need to define what column(s) can be used to define the order. From there it is pretty simple.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • David Burrows (5/30/2014)[hrthe table is a heap and therefore has no order.

    Tables have no order if it is a heap or not. A clustered index does not provide order to the table.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • At first sight, I though you needed just MAX and MIN, but looking at the expected results I found out that it was a little more complicated.

    Here's an option:

    WITH CTE AS(

    select *,

    ROW_NUMBER() OVER(PARTITION BY userID ORDER BY BaseYear) rnasc,

    ROW_NUMBER() OVER(PARTITION BY userID ORDER BY BaseYear DESC) rndesc

    from #sampledata

    )

    SELECT userId,

    MAX( CASE WHEN rnasc = 1 THEN TotalSales END) AS StartValue,

    MAX( CASE WHEN rndesc = 1 THEN TotalSales END) AS EndValue,

    COUNT(*) AS Records

    FROM CTE

    GROUP BY userId

    Note that your sample data didn't gave the exact results because you used UNION instead of UNION ALL, so you were losing duplicates. You should know as well that with group 3 there's no guarantee of the exact results because there's nothing to order by as base year is always the same.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Sean Lange (5/30/2014)


    David Burrows (5/30/2014)[hrthe table is a heap and therefore has no order.

    Tables have no order if it is a heap or not. A clustered index does not provide order to the table.

    True. My bad :blush:

    Meant to only state that the data had no implied order.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Luis Cazares (5/30/2014)


    Note that your sample data didn't gave the exact results because you used UNION instead of UNION ALL, so you were losing duplicates. You should know as well that with group 3 there's no guarantee of the exact results because there's nothing to order by as base year is always the same.

    That is why I did not post the solution you posted 😀

    Far away is close at hand in the images of elsewhere.
    Anon.

  • David Burrows (5/30/2014)


    Sean Lange (5/30/2014)


    David Burrows (5/30/2014)[hrthe table is a heap and therefore has no order.

    Tables have no order if it is a heap or not. A clustered index does not provide order to the table.

    True. My bad :blush:

    Meant to only state that the data had no implied order.

    I know that you know that. I felt it important to state though in case others wander in here someday. 😀

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (5/30/2014)


    I felt it important to state though in case others wander in here someday. 😀

    And quite right too! 🙂

    Far away is close at hand in the images of elsewhere.
    Anon.

  • If you want your results based on insert order and you add IDENTITY column you would end up with something like this

    CREATE TABLE #sampledata

    (

    RowID int IDENTITY(1,1),

    userId int,

    BaseYear int,

    TotalSales float

    )

    INSERT INTO #sampledata (userId,BaseYear,TotalSales)

    SELECT 1,2008,25000 UNION ALL

    SELECT 1,2009,22500 UNION ALL

    SELECT 1,2010,23400 UNION ALL

    SELECT 1,2011,26700 UNION ALL

    SELECT 1,2012,25200 UNION ALL

    SELECT 2,2008,25050 UNION ALL

    SELECT 2,2009,22510 UNION ALL

    SELECT 2,2010,23470 UNION ALL

    SELECT 2,2011,26200 UNION ALL

    SELECT 3,2008,53504 UNION ALL

    SELECT 3,2008,43504 UNION ALL

    SELECT 3,2008,25504 UNION ALL

    SELECT 3,2008,23404 UNION ALL

    SELECT 3,2008,25504 UNION ALL

    SELECT 3,2008,23804

    ;WITH cte (userId,TotalSales,RowID,MinID,MaxID)

    AS (SELECT userId,TotalSales,RowID,

    MIN(RowID) OVER (PARTITION BY userId),

    MAX(RowID) OVER (PARTITION BY userId)

    FROM #sampledata)

    SELECT userId,

    MAX(CASE WHEN RowID = MinID THEN TotalSales END) AS [StartValue],

    MAX(CASE WHEN RowID = MaxID THEN TotalSales END) AS [EndValue],

    COUNT(*) AS [Records]

    FROM cte

    GROUP BY userId

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Since this is a 2012 thread, how about the window functions?

    😎

    create table #sampledata

    (

    userId int,

    BaseYear int,

    TotalSales float

    )

    insert into #sampledata

    select 1,2008,25000 union

    select 1,2009,22500 union

    select 1,2010,23400 union

    select 1,2011,26700 union

    select 1,2012,25200 union

    select 2,2008,25050 union

    select 2,2009,22510 union

    select 2,2010,23470 union

    select 2,2011,26200 union

    select 3,2008,53504 union

    select 3,2008,43504 union

    select 3,2008,25504 union

    select 3,2008,23404 union

    select 3,2008,25504 union

    select 3,2008,23804

    ;WITH S_LIST AS

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY SD.userId

    ORDER BY (SELECT NULL)

    ) AS RID

    ,SD.userId

    ,FIRST_VALUE(SD.BaseYear) OVER

    (

    PARTITION BY SD.userId

    ORDER BY SD.TotalSales ASC

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND UNBOUNDED FOLLOWING

    ) AS FIRST_BaseYear

    ,LAST_VALUE(SD.BaseYear) OVER

    (

    PARTITION BY SD.userId

    ORDER BY SD.TotalSales ASC

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND UNBOUNDED FOLLOWING

    ) AS LAST_BaseYear

    ,FIRST_VALUE(SD.TotalSales) OVER

    (

    PARTITION BY SD.userId

    ORDER BY SD.BaseYear ASC

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND UNBOUNDED FOLLOWING

    ) AS FIRST_TotalSales

    ,LAST_VALUE(SD.TotalSales) OVER

    (

    PARTITION BY SD.userId

    ORDER BY SD.BaseYear ASC

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND UNBOUNDED FOLLOWING

    ) AS LAST_TotalSales

    FROM #sampledata SD

    )

    SELECT

    *

    FROM S_LIST SL

    WHERE SL.RID = 1

    drop table #sampledata

    Results

    RID userId FIRST_BaseYear LAST_BaseYear FIRST_TotalSales LAST_TotalSales

    ---- ------- -------------- ------------- ----------------- ----------------

    1 1 2009 2011 25000 25200

    1 2 2009 2011 25050 26200

    1 3 2008 2008 23404 53504

  • The problem with my data is it comes as a result of a lot of calculation in a stored procedure. In the end there is only one temporary table that holds the record. Now, the requirement was to also find the first record and last record for every group(not min and max). So instead of making changes everywhere in the statements I wanted to get the required values from the available result set itself. your solution worked just fine. Thanks a lot.

  • The problem with my data is it comes as a result of a lot of calculation in a stored procedure. In the end there is only one temporary table that holds the record. Now, the requirement was to also find the first record and last record for every group(not min and max). So instead of making changes everywhere in the statements I wanted to get the required values from the available result set itself. your solution worked just fine. Thanks a lot.

  • zafarthesultan (6/2/2014)


    The problem with my data is it comes as a result of a lot of calculation in a stored procedure. In the end there is only one temporary table that holds the record. Now, the requirement was to also find the first record and last record for every group(not min and max). So instead of making changes everywhere in the statements I wanted to get the required values from the available result set itself. your solution worked just fine. Thanks a lot.

    The order in which rows are stored is un-predictable. Just to give you a feeling for what I mean, in your script for loading data change all occurences of 'union' to 'union all'. Then run it and do a select * from #sampledata. You will find the order (especially for the userid=3 group) quite different. So in your opinion, in what order did the rows arrive?

  • Michael Meierruth (6/3/2014)


    zafarthesultan (6/2/2014)


    The problem with my data is it comes as a result of a lot of calculation in a stored procedure. In the end there is only one temporary table that holds the record. Now, the requirement was to also find the first record and last record for every group(not min and max). So instead of making changes everywhere in the statements I wanted to get the required values from the available result set itself. your solution worked just fine. Thanks a lot.

    The order in which rows are stored is un-predictable. Just to give you a feeling for what I mean, in your script for loading data change all occurences of 'union' to 'union all'. Then run it and do a select * from #sampledata. You will find the order (especially for the userid=3 group) quite different. So in your opinion, in what order did the rows arrive?

    That is why I suggested adding IDENTITY column as this will give the INSERT order which what was indicated in the first post. 😀

    Far away is close at hand in the images of elsewhere.
    Anon.

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply