Sum a Column by Client and Group by date 4 weeks apart!

  • Hi,

    I ma using ms sql server 2010 and vb.net 2010.

    I have a simple table that has ID, customersID, WeekEndingDate, WorkedHours,...and a few other columns.

    The WeekEndingdate will have all 52 weeks of the year for all clients in it.

    What I want to be able to do is group the dates into 4 week groups and sum the WorkedHours for each client.

    Assuming that every week end date is in the table I should get out

    Client 1234, 20/04/2014, 200 Hours... for 13 4 week periods for each client.

    Which is the best way to do it?

    Thanking you

  • Hi,

    This sounds relatively simple, but there's a problem. We need something to work with. Please post DDL, sample data (as insert statements) and expected output based on the sample data.

    Read the article posted in my signature to find out information on how to do it.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Hi,

    Thanks for your prompt reply.

    I think the data is very simple really.

    ClientID's are Integers

    WeekEndingDate is a date , there is 52 weekly dates for each Client.

    WorkedHours is a Double

    I want to sum the WorkedHours by Client by 4 weeks group.

    Its grouping the dates thats difficult as I want the output to have the highest date for each grouping .

    So final output should be 13 roes for each client.with the WekkeEdning Date being 4 weeks apart.

    Trust that clear up any confusion. I dont think DDL etc woudl make it any clearer in this scenario.

    kind regards

  • Without sample data and DDL , no-one is going to be able to provide a working solution.

    So, I suggest you use a calendar table, with every date you are interested in (don't forget to go back in time as well as forward just to ensure you can use the query on old data) and it's associated "4 week period".

    Then it is a simple matter of joining to that "calendar" table by date and grouping by "4 week period".

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Hi MM,

    Thansks for your suggestions.

    Thats what I was thinking myself a calendar table with just 2 columns something like...

    MonthID, WeekEndate

    1 , 7/1/14

    1 , 14/1/14

    1 , 21/1/14

    1 , 28/1/14

    2 , 5/2/14

    2 , 12/2/14

    and so on...

    then join it to my table and group by MonthID and CustomerID.

    But I was thinking I cant be the first person to do this... so maybe someone else has a better way of doing it...

    kind regards

  • This is a possible solution but I'm not sure that it will work as it is for your data because I can't test it. This won't need a calendar table or additional columns, but it might not be exactly what you need and you'll have to work with it.

    CREATE TABLE Sometable(

    CustomersID int,

    WeekEndingDate date,

    WorkedHours decimal( 18,8));

    WITH CTE AS(

    SELECT customersID,

    WeekEndingDate,

    (DENSE_RANK() OVER( PARTITION BY customersID ORDER BY WeekEndingDate) - 1 )/ 4 drankdrank,

    WorkedHours

    FROM SomeTable

    )

    SELECT customersID,

    MIN( WeekEndingDate),

    MAX( WeekEndingDate),

    SUM( WorkedHours)

    FROM CTE

    GROUP BY drank

    DDL would have been simpler than the explanation, as you can see in the example. Sample data and expected results would have helped to validate that the code is correct instead of a wild guess.

    This will work with SQL Server 2008 & 2012 which I suppose that you're using as there's no 2010.

    Double doesn't exists in SQL Server, so I used a decimal type.

    EDIT: Code corrected.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • ;WITH cte (CustomerID,WeekEndingDate,WorkedHours,WeekNo) AS (

    SELECT CustomerID,WeekEndingDate,WorkedHours,CEILING(ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY WeekEndingDate) / 4.0)

    FROM

    )

    SELECT CustomerID,MAX(WeekEndingDate),SUM(WorkedHours)

    FROM cte

    GROUP BY CustomerID,WeekNo

    ORDER BY CustomerID ASC,2 ASC

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Hi Guys, thanks for your reply, it looks like something that would work the bill.

    Everything was going grand until I read

    DENSE_RANK() OVER( PARTITION BY WeekEndingDate ORDER BY customersID) - 1 )/ 4 drank

    I am familiar with most of t-sql but an explanation of above would help...

    What I want out is three columns as follows;

    customerID , Max(4WeekEndDate), Sum(WorkedHours)

    123456 , 30/1/14 , 200

    123456 , 28/2/14 , 145.5

    321321 , 30/1/14 , 311

    321321 , 28/2/14 , 125.5

    So while my original table had 52 weekly entries for each client with hours worked for each week.

    So if I had 10 clients there woudl be 520 rows of data.

    My final table should have 13 rows for each client thus onyl 130 rows and each row would have the max date for that particular 4 week period...

    All suggestions appreciated.

    kind regards

    I take your point about the DDL but I think it doesnt give the full story...

  • This is why I wanted sample data. That way I could try the code. :hehe:

    The problem is that I used the columns wrong in the OVER clause. It should read as follows:

    DENSE_RANK() OVER( PARTITION BY customersID ORDER BY WeekEndingDate) - 1 )/ 4 drank

    I'll correct my original post.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Have a look at these articles and see if they will help.

    http://jasonbrimhall.info/2010/02/13/payperiod-ii/

    Though the schemas are different, it seems to be the same general issue.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Tallboy (5/27/2014)


    What I want out is three columns as follows;

    customerID , Max(4WeekEndDate), Sum(WorkedHours)

    123456 , 30/1/14 , 200

    123456 , 28/2/14 , 145.5

    321321 , 30/1/14 , 311

    321321 , 28/2/14 , 125.5

    I thought that is what my solution did :blink:

    It does not matter if you replace the CEILING/ROW_NUMBER with DENSE_RANK it will produce the same result in the end.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Hi Guys, Thanks for all the excellent idea's.

    On further thought and examination of my needs , I think I need a stored procedure in which I pass in a StartDate ( a pay period ending date) and a StaffID number.( a pay period consist of 4 weeks!)

    I have three simple tables as follows

    table1 (StaffID, WeekendingDate, WorkedHours)

    table2 (StaffID, WeekendingDate, ALHours)

    table3 (StaffID, WeekendingDate, SLHours)

    The StaffID will be in table1 but not every staff member will have an entery in table2 or table3.

    So when the procedure receives that StaffID and DAte it queries each table and extracts a total WorkedHours, ALHours and SLHours (if data in table 2 and table 3 exits else nulls) between StartDate and StartDate- 23 Days!.

    Then JOIN's each table by StaffID and returns a dataset I can display in a datagridview in vb.net !

    I know this is different from what I initially suggested but this is because some Staff may have 4 weeks data in table1 and some may only have 1 weeks data, also they may have no data in the other 2 tables.

    I hope that all makes sense, if not let me know.

    kind regards

  • SELECT t1.StaffID,

    SUM(t2.ALHours) AS [ALHours],

    SUM(t3.SLHours) AS [SLHours]

    FROM table1 t1

    LEFT JOIN table2 t2

    ON t2.StaffID = t1.StaffID

    AND t2.WeekendingDate = t1.WeekendingDate

    LEFT JOIN table3 t2

    ON t3.StaffID = t1.StaffID

    AND t3.WeekendingDate = t1.WeekendingDate

    WHERE t1.StaffID = @StaffID

    AND t1.WeekendingDate BETWEEN DATEADD(day,-23,@StartDate) AND @StartDate

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Hi David,

    Thank you so much for your answer, it looks like it will fit the bill alright

    I just have one amendment I cant be sure that the dates in the tables will line up exactly .

    In a perfect world they will but in reality users a rushing and type a slightly different date, so I would prefer a kind if catch all between 2 dates. So my suggestion would be something like this...

    SELECT t1.StaffID,

    SUM(t2.ALHours) AS [ALHours],

    SUM(t3.SLHours) AS [SLHours]

    FROM table1 t1

    LEFT JOIN table2 t2

    ON t2.StaffID = t1.StaffID

    AND t2.WeekendingDate BETWEEN DATEADD(day,-23,@StartDate) AND @StartDate

    LEFT JOIN table3 t2

    ON t3.StaffID = t1.StaffID

    AND t3.WeekendingDate BETWEEN DATEADD(day,-23,@StartDate) AND @StartDate

    WHERE t1.StaffID = @StaffID

    AND t1.WeekendingDate BETWEEN DATEADD(day,-23,@StartDate) AND @StartDate

    Would you agree?

    Also

  • That might give you a semi cross join which will lead to incorrect results. Date calculations can prevent that but you might get some inconsistencies. Check the following:

    CREATE TABLE #table1 (StaffID int, WeekendingDate date, WorkedHours int)

    CREATE TABLE #table2 (StaffID int, WeekendingDate date, ALHours int)

    CREATE TABLE #table3 (StaffID int, WeekendingDate date, SLHours int)

    INSERT INTO #table1 VALUES

    (1,'20140504', 40),

    (1,'20140511', 40),

    (1,'20140518', 40)

    INSERT INTO #table2 VALUES

    (1,'20140503', 30),

    (1,'20140510', 30),

    (1,'20140517', 30)

    INSERT INTO #table3 VALUES

    (1,'20140504', 20),

    (1,'20140511', 20),

    (1,'20140518', 20)

    DECLARE @StartDate date = GETDATE(),

    @StaffID int = 1

    SELECT t1.StaffID,

    SUM(t2.ALHours) AS [ALHours],

    SUM(t3.SLHours) AS [SLHours]

    FROM #table1 t1

    LEFT JOIN #table2 t2

    ON t2.StaffID = t1.StaffID

    AND t2.WeekendingDate BETWEEN DATEADD(day,-23,@StartDate) AND @StartDate

    LEFT JOIN #table3 t3

    ON t3.StaffID = t1.StaffID

    AND t3.WeekendingDate BETWEEN DATEADD(day,-23,@StartDate) AND @StartDate

    WHERE t1.StaffID = @StaffID

    AND t1.WeekendingDate BETWEEN DATEADD(day,-23,@StartDate) AND @StartDate

    GROUP BY t1.StaffID

    GO

    DROP TABLE #table1

    DROP TABLE #table2

    DROP TABLE #table3

    The best you could do is to add a CHECK constraint to your WeekendingDate column. That will ensure that all dates will be correct and you can join by date. Something like this:

    CREATE TABLE Test(

    TestDate date CHECK (DATEPART(dw, TestDate) = 7))

    --And a test

    GO

    INSERT INTO Test VALUES( '20140101')

    GO

    INSERT INTO Test VALUES( '20140102')

    GO

    INSERT INTO Test VALUES( '20140103')

    GO

    INSERT INTO Test VALUES( '20140104')

    GO

    INSERT INTO Test VALUES( '20140105')

    GO

    INSERT INTO Test VALUES( '20140106')

    GO

    INSERT INTO Test VALUES( '20140107')

    GO

    SELECT * FROM Test

    GO

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Viewing 15 posts - 1 through 15 (of 19 total)

    You must be logged in to reply to this topic. Login to reply