Completing Fixed Sets With Another Table

  • Hi,

    I'm having some trouble finding a solution to a problem, its been a few hours now and I think I just completely loose track of what I'm doing and ended up a few times bashing and mix matching keywords around the lines of 'OUTER', 'MERGE', 'RIGHT', EXCEPT' and 'CROSS APPLY'. I need some extra help with this one.

    I've got a table that holds batches of 52-53 records (one per week of the year) per every PersonId. Problem was that many of them had duplicated records for X year (eg. 2 or 3 year 2015 batches), deleting the duplicates was more or less a pain but now the task is to complete the ones that are missing weeks, so some of them have 20 records for year 2015 (52 week year) then I need to add the missing 32 weeks.

    Payment table is a bit like:

    CREATE TABLE #Payment (

    PaymentId int,

    PersonId int,

    PaymentYear int,

    PaymentWeek int,

    EePaid decimal(18,2),

    ErPaid decimal (18,2))

    DECLARE @i INT = 1

    WHILE (@i <= 52)

    BEGIN

    INSERT INTO #Payment (

    PaymentId,

    PersonId,

    PaymentYear,

    PaymentWeek,

    EePaid,

    ErPaid)

    VALUES

    (999 + @i, 1, 2015, @i, 0, 0)

    SET @i = @i + 1

    END

    INSERT INTO #Payment (

    PaymentId,

    PersonId,

    PaymentYear,

    PaymentWeek,

    EePaid,

    ErPaid)

    VALUES

    (1256, 2, 2015, 26, 0, 0),

    (1257, 2, 2015, 27, 0, 0),

    (1258, 2, 2015, 28, 0, 0),

    (1259, 2, 2015, 29, 0, 0),

    (1260, 2, 2015, 30, 0, 0),

    (1261, 2, 2015, 31, 0, 0),

    (1262, 2, 2015, 32, 0, 0)

    SELECT * FROM #Payment

    As you can see PersonId 1 have a full year 2015 but PersonId 2 only has week 26 'till 32

    I'm getting the corresponding Weeks of the year with the following CTE:

    WITH

    E(n) AS(

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n

    ),

    cteDates(Monday) AS(

    SELECT TOP(DATEDIFF(dd, '2005-01-01', DATEADD(yy, DATEDIFF(yy,0,getdate()) + 1, -1))/7)

    DATEADD(ww, (DATEDIFF(dd, 0, '2005-01-01')/7) + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)), 0) Monday

    FROM E a, E b, E c, E d

    )

    SELECT

    Monday AS WkStDt

    ,YEAR(Monday) AS WkYear

    ,ROW_NUMBER() OVER(PARTITION BY YEAR(Monday) ORDER BY Monday) AS WkNo

    FROM cteDates

    So I figured I could Insert from a join but my theory came up short and I ended up with all sorts, here's my last take and its a general mess, but will illustrate perfectly the insanity I reached trying to solve this problem. (* WeekList in the query is the CTE)

    insert into Payment (

    PersonId,

    PaymentYear,

    WeekNumber,

    WeekStartDate

    )

    select

    t1.PersonId,

    t1.WkYear,

    t1.WkNo,

    t1.WkStDt

    from

    (

    SELECT

    Person.PersonId,

    CA.WkYear,

    CA.WkNo,

    CA.WkStDt

    FROM

    Person

    INNER JOIN

    (

    SELECT

    Person.PayeeNo,

    WeekList.WkYear,

    WeekList.WkNo,

    WeekList.WkStDt

    FROM Person CROSS APPLY WeekList

    ) CA ON Person.PayeeNo = CA.PayeeNo

    WHERE

    Person.PayeeNo IN

    (

    select sd.PayeeNo from

    (

    select PayeeNo, PaymentYear

    from Payment

    group by PensionNo, PaymentYear

    having count(*) < 52

    ) sd

    )

    INTERSECT

    SELECT

    Payment.PersonId,

    Payment.PayeeNo,

    Payment.PaymentYear,

    Payment.WeekNumber,

    Payment.WeekStartDate

    FROM Payment

    WHERE

    Payment.PayeeNo IN

    (

    select sd.PensionNo from

    (

    select PayeeNo, PaymentYear

    from Payment

    group by PayeeNo, PaymentYear

    having count(*) < 52

    ) sd

    )

    ) t1

    Any help would be appretiated :hehe:


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • Not a complete solution but should help you get over this hurdle

    😎

    USE TEEST;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID(N'tempdb..#Payment') IS NOT NULL DROP TABLE #Payment;

    CREATE TABLE #Payment (

    PaymentId int,

    PersonId int,

    PaymentYear int,

    PaymentWeek int,

    EePaid decimal(18,2),

    ErPaid decimal (18,2))

    DECLARE @i INT = 1

    WHILE (@i <= 52)

    BEGIN

    INSERT INTO #Payment (

    PaymentId,

    PersonId,

    PaymentYear,

    PaymentWeek,

    EePaid,

    ErPaid)

    VALUES

    (999 + @i, 1, 2015, @i, 0, 0)

    SET @i = @i + 1

    END

    INSERT INTO #Payment (

    PaymentId,

    PersonId,

    PaymentYear,

    PaymentWeek,

    EePaid,

    ErPaid)

    VALUES

    (1256, 2, 2015, 26, 0, 0),

    (1257, 2, 2015, 27, 0, 0),

    (1258, 2, 2015, 28, 0, 0),

    (1259, 2, 2015, 29, 0, 0),

    (1260, 2, 2015, 30, 0, 0),

    (1261, 2, 2015, 31, 0, 0),

    (1262, 2, 2015, 32, 0, 0)

    ;WITH BASE_DATA AS

    (

    SELECT

    PM.PaymentId

    ,PM.PersonId

    ,PM.PaymentYear

    ,PM.PaymentWeek

    ,PM.EePaid

    ,PM.ErPaid

    ,COUNT(*) OVER

    (

    PARTITION BY PM.PersonId

    ,PM.PaymentYear

    ) AS PM_CNT

    FROM #Payment PM

    )

    ,CALENDAR_CONFIG AS

    (

    SELECT

    MIN(BD.PaymentYear) AS START_YEAR

    ,DATEADD(WEEK,CEILING(DATEDIFF(DAY,0,DATEFROMPARTS(MIN(BD.PaymentYear),1,1)) / 7.0),0) AS FIRST_MONDAY

    ,DATEADD(WEEK,FLOOR(DATEDIFF(DAY,0,DATEFROMPARTS(YEAR(GETDATE()),12,31)) / 7.0),0) AS LAST_MONDAY

    ,DATEDIFF(WEEK,DATEFROMPARTS(MIN(BD.PaymentYear),1,1),DATEFROMPARTS(YEAR(GETDATE()),12,31)) AS NUM_WEEKS

    FROM BASE_DATA BD

    )

    ,T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))

    ,ALL_MONDAYS(MDATE) AS

    (

    SELECT

    TOP((SELECT CC.NUM_WEEKS FROM CALENDAR_CONFIG CC))

    DATEADD(WEEK,ROW_NUMBER() OVER (ORDER BY @@VERSION) - 1,CC.FIRST_MONDAY)

    FROM CALENDAR_CONFIG CC

    CROSS APPLY T T1,T T2,T T3,T T4,T T5

    )

    ,CALENDAR(YYYY,WW) AS

    (

    SELECT

    YEAR(AM.MDATE) AS PaymentYear

    ,ROW_NUMBER() OVER

    (

    PARTITION BY YEAR(AM.MDATE)

    ORDER BY AM.MDATE

    ) AS PaymentWeek

    FROM ALL_MONDAYS AM

    )

    ,SUBJECT_LIST(PersonId) AS

    (

    SELECT

    DISTINCT PM.PersonId

    FROM #Payment PM

    )

    ,MISSING_WEEKS AS

    (

    SELECT

    SL.PersonId

    ,C.YYYY

    ,C.WW

    FROM SUBJECT_LIST SL

    CROSS APPLY CALENDAR C

    EXCEPT

    SELECT

    PM.PersonId

    ,PM.PaymentYear

    ,PM.PaymentWeek

    FROM #Payment PM

    )

    SELECT

    *

    FROM MISSING_WEEKS MW

    LEFT OUTER JOIN #Payment PM

    ON MW.PersonId = PM.PersonId

    AND MW.YYYY = PM.PaymentYear

    AND MW.WW = PM.PaymentWeek;

  • sys.user (10/30/2016)


    Hi,

    I'm having some trouble finding a solution to a problem, its been a few hours now and I think I just completely loose track of what I'm doing and ended up a few times bashing and mix matching keywords around the lines of 'OUTER', 'MERGE', 'RIGHT', EXCEPT' and 'CROSS APPLY'. I need some extra help with this one.

    I've got a table that holds batches of 52-53 records (one per week of the year) per every PersonId. Problem was that many of them had duplicated records for X year (eg. 2 or 3 year 2015 batches), deleting the duplicates was more or less a pain but now the task is to complete the ones that are missing weeks, so some of them have 20 records for year 2015 (52 week year) then I need to add the missing 32 weeks.

    Payment table is a bit like:

    CREATE TABLE #Payment (

    PaymentId int,

    PersonId int,

    PaymentYear int,

    PaymentWeek int,

    EePaid decimal(18,2),

    ErPaid decimal (18,2))

    DECLARE @i INT = 1

    WHILE (@i <= 52)

    BEGIN

    INSERT INTO #Payment (

    PaymentId,

    PersonId,

    PaymentYear,

    PaymentWeek,

    EePaid,

    ErPaid)

    VALUES

    (999 + @i, 1, 2015, @i, 0, 0)

    SET @i = @i + 1

    END

    INSERT INTO #Payment (

    PaymentId,

    PersonId,

    PaymentYear,

    PaymentWeek,

    EePaid,

    ErPaid)

    VALUES

    (1256, 2, 2015, 26, 0, 0),

    (1257, 2, 2015, 27, 0, 0),

    (1258, 2, 2015, 28, 0, 0),

    (1259, 2, 2015, 29, 0, 0),

    (1260, 2, 2015, 30, 0, 0),

    (1261, 2, 2015, 31, 0, 0),

    (1262, 2, 2015, 32, 0, 0)

    SELECT * FROM #Payment

    As you can see PersonId 1 have a full year 2015 but PersonId 2 only has week 26 'till 32

    I'm getting the corresponding Weeks of the year with the following CTE:

    WITH

    E(n) AS(

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n UNION ALL

    SELECT 0 AS n

    ),

    cteDates(Monday) AS(

    SELECT TOP(DATEDIFF(dd, '2005-01-01', DATEADD(yy, DATEDIFF(yy,0,getdate()) + 1, -1))/7)

    DATEADD(ww, (DATEDIFF(dd, 0, '2005-01-01')/7) + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)), 0) Monday

    FROM E a, E b, E c, E d

    )

    SELECT

    Monday AS WkStDt

    ,YEAR(Monday) AS WkYear

    ,ROW_NUMBER() OVER(PARTITION BY YEAR(Monday) ORDER BY Monday) AS WkNo

    FROM cteDates

    So I figured I could Insert from a join but my theory came up short and I ended up with all sorts, here's my last take and its a general mess, but will illustrate perfectly the insanity I reached trying to solve this problem. (* WeekList in the query is the CTE)

    insert into Payment (

    PersonId,

    PaymentYear,

    WeekNumber,

    WeekStartDate

    )

    select

    t1.PersonId,

    t1.WkYear,

    t1.WkNo,

    t1.WkStDt

    from

    (

    SELECT

    Person.PersonId,

    CA.WkYear,

    CA.WkNo,

    CA.WkStDt

    FROM

    Person

    INNER JOIN

    (

    SELECT

    Person.PayeeNo,

    WeekList.WkYear,

    WeekList.WkNo,

    WeekList.WkStDt

    FROM Person CROSS APPLY WeekList

    ) CA ON Person.PayeeNo = CA.PayeeNo

    WHERE

    Person.PayeeNo IN

    (

    select sd.PayeeNo from

    (

    select PayeeNo, PaymentYear

    from Payment

    group by PensionNo, PaymentYear

    having count(*) < 52

    ) sd

    )

    INTERSECT

    SELECT

    Payment.PersonId,

    Payment.PayeeNo,

    Payment.PaymentYear,

    Payment.WeekNumber,

    Payment.WeekStartDate

    FROM Payment

    WHERE

    Payment.PayeeNo IN

    (

    select sd.PensionNo from

    (

    select PayeeNo, PaymentYear

    from Payment

    group by PayeeNo, PaymentYear

    having count(*) < 52

    ) sd

    )

    ) t1

    Any help would be appretiated :hehe:

    What is your "week" based on", especially for the first and last week of the year where it's normally not a 7 day week? Also, do you have a company calendar table in the system?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (10/30/2016)


    What is your "week" based on", especially for the first and last week of the year where it's normally not a 7 day week? Also, do you have a company calendar table in the system?

    Hi Jeff,

    Date settings are standard en-GB, Monday first day of the week, First week of the year is the first full week (eg, Week 1 this year 04/01/2016)


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • sys.user (10/30/2016)


    Jeff Moden (10/30/2016)


    What is your "week" based on", especially for the first and last week of the year where it's normally not a 7 day week? Also, do you have a company calendar table in the system?

    Hi Jeff,

    Date settings are standard en-GB, Monday first day of the week, First week of the year is the first full week (eg, Week 1 this year 04/01/2016)

    So, ISO Weeks then?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (10/30/2016)


    sys.user (10/30/2016)


    Jeff Moden (10/30/2016)


    What is your "week" based on", especially for the first and last week of the year where it's normally not a 7 day week? Also, do you have a company calendar table in the system?

    Hi Jeff,

    Date settings are standard en-GB, Monday first day of the week, First week of the year is the first full week (eg, Week 1 this year 04/01/2016)

    So, ISO Weeks then?

    Yeah, ISO weeks


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • K. Working on a set-based method...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Since you're working with ISO Weeks, the following won't only help you with this particular problem but could also provide some help for you in the future. Hopefully, you don't need to calculate ISO Weeks or ISO Years for the year 9999 (I left that out for simplicity's sake).

    First, we need to generate counts. Instead of rewriting code over and over to do that, here's my iTVF for generating an on-the-fly Tally Table-like structure.

    CREATE FUNCTION [dbo].[fnTally]

    /**********************************************************************************************************************

    Purpose:

    Return a column of BIGINTs from @ZeroOrOne up to and including @MaxN with a max value of 1 Trillion.

    As a performance note, it takes about 00:02:10 (hh:mm:ss) to generate 1 Billion numbers to a throw-away variable.

    Usage:

    --===== Syntax example (Returns BIGINT)

    SELECT t.N

    FROM dbo.fnTally(@ZeroOrOne,@MaxN) t

    ;

    Notes:

    1. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.

    Refer to the following URLs for how it works and introduction for how it replaces certain loops.

    http://qa.sqlservercentral.com/articles/T-SQL/62867/

    http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers

    2. To start a sequence at 0, @ZeroOrOne must be 0 or NULL. Any other value that's convertable to the BIT data-type

    will cause the sequence to start at 1.

    3. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.

    5. If @MaxN is negative or NULL, a "TOP" error will be returned.

    6. @MaxN must be a positive number from >= the value of @ZeroOrOne up to and including 1 Billion. If a larger

    number is used, the function will silently truncate after 1 Billion. If you actually need a sequence with

    that many values, you should consider using a different tool. ;-)

    7. There will be a substantial reduction in performance if "N" is sorted in descending order. If a descending

    sort is required, use code similar to the following. Performance will decrease by about 27% but it's still

    very fast especially compared with just doing a simple descending sort on "N", which is about 20 times slower.

    If @ZeroOrOne is a 0, in this case, remove the "+1" from the code.

    DECLARE @MaxN BIGINT;

    SELECT @MaxN = 1000;

    SELECT DescendingN = @MaxN-N+1

    FROM dbo.fnTally(1,@MaxN);

    8. There is no performance penalty for sorting "N" in ascending order because the output is explicity sorted by

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    Revision History:

    Rev 00 - Unknown - Jeff Moden

    - Initial creation with error handling for @MaxN.

    Rev 01 - 09 Feb 2013 - Jeff Moden

    - Modified to start at 0 or 1.

    Rev 02 - 16 May 2013 - Jeff Moden

    - Removed error handling for @MaxN because of exceptional cases.

    Rev 03 - 22 Apr 2015 - Jeff Moden

    - Modify to handle 1 Trillion rows for experimental purposes.

    **********************************************************************************************************************/

    (@ZeroOrOne BIT, @MaxN BIGINT)

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN WITH

    E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1) --10E1 or 10 rows

    , E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d) --10E4 or 10 Thousand rows

    ,E12(N) AS (SELECT 1 FROM E4 a, E4 b, E4 c) --10E12 or 1 Trillion rows

    SELECT N = 0 WHERE ISNULL(@ZeroOrOne,0)= 0 --Conditionally start at 0.

    UNION ALL

    SELECT TOP(@MaxN) N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E12 -- Values from 1 to @MaxN

    ;

    Up next are ISO WEEK and YEAR calculations. Microsoft didn't do us any favors with ISOWK here... they don't have a similar function to generate the ISO year. With that in mind, I finally got off my haunches and made one. This works as far back as SQLServer 2005. Details are in the comments, as always.

    CREATE FUNCTION dbo.IsoWeekCalendar

    /**********************************************************************************************************************

    Purpose:

    Given a start and end date, create an on-the-fly ISO Week table for the years involved as identified by the start and

    end date. Important: See "Developer Notes 1." below for limits on dates.

    Usage Examples:

    --===== Basic syntax

    SELECT *

    FROM dbo.IsoWeekCalendar(@pStartDate,@pEndDate)

    ;

    --===== Full DATETIME example

    SELECT *

    FROM dbo.IsoWeekCalendar('2015-04-05 15:35:27.123','2016-06-07 09:25:18.997')

    ;

    --===== Whole year example (years must be in single quotes if a literal)

    SELECT *

    FROM dbo.IsoWeekCalendar('1753','9998') --Demonstration of the min and max limits of this function.

    ;

    Developer Notes:

    1. The low value for any date should have a year >= 1753 and the high value for any date should have a year <= 9998.

    2. Derived from an original concept created by "t-clausen.dk" at the following link:

    http://stackoverflow.com/questions/7330711/isoweek-in-sql-server-2005

    3. Functionality improved by Peter Larsson and Jeff Moden and explained in full by Jeff Moden at the following link:

    http://qa.sqlservercentral.com/articles/T-SQL/97910/

    Dependencies:

    1. Requires SQL Server 2005 or better.

    2. Requires separate dbo.fnTally inline Table Valued Function to already exist and must be able to return a "0"

    to start a sequence.

    Revision History:

    Rev 00 - 30 Oct 2016 - Jeff Moden

    - Initial creation and unit test to support the following forum post:

    - http://qa.sqlservercentral.com/Forums/Topic1830292-3412-1.aspx

    **********************************************************************************************************************/

    --===== Parameters for this function

    (

    @pStartDate DATETIME

    ,@pEndDate DATETIME

    )

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN WITH

    cteStartEndDates AS

    ( --=== Find the Monday for the first year involved (might be in the previous year)

    -- and the first Monday after the last year involved (might be in the next year)

    SELECT StartDate = DATEADD(dd,DATEDIFF(dd,0,DATEADD(yy,DATEDIFF(yy, 0,@pStartDate),0))/7*7 ,0)

    ,EndDate = DATEADD(dd,DATEDIFF(dd,0,DATEADD(yy,DATEDIFF(yy,-1,@pEndDate ),0))/7*7+7,0)

    )

    , cteAllDates AS

    ( --=== Build all the Monday/Thursday dates for the date range identified in the previous CTE.

    -- We need the Thursday date to calculate the ISO Year.

    SELECT Monday = DATEADD(dd,t.N*7 ,dt.StartDate)

    ,Thursday = DATEADD(dd,t.N*7+3,dt.StartDate)

    FROM cteStartEndDates dt

    CROSS APPLY dbo.fnTally(0,DATEDIFF(dd,dt.StartDate,dt.EndDate)/7) t

    )

    , cteISOWeekCalendar AS

    ( --=== Create the ISO Week Calendar for the years involved by the input parameters and some spillover.

    SELECT WeekStart = ad.Monday

    ,NextWeekStart = DATEADD(dd,7,ad.Monday)

    ,ISOWeek = (DATEPART(dy,DATEADD(dd,DATEDIFF(dd,'17530101',ad.Monday)/7*7,'17530104'))+6)/7

    ,ISOYear = DATEPART(yy,ad.Thursday)

    FROM cteAllDates ad

    )

    --===== Return the ISO Week Calendar for the years involved by the input parameters without spillover.

    SELECT wc.WeekStart, wc.NextWeekStart, wc.ISOYear, wc.ISOWeek

    FROM cteISOWeekCalendar wc

    WHERE wc.ISOYear BETWEEN DATEPART(yy,@pStartDate) AND DATEPART(yy,@pEndDate)

    ;

    GO

    After that, everything else becomes pretty easy for this and other problems. Here's a solution to the problem you posted using the test data you posted. Notice that you don't have to really know anything about the years involved for each person. The code figures all that out.

    WITH ctePersonPmtDates AS

    ( --=== Find the min and max payment years for each PersonID

    SELECT PersonID

    ,MinPmtYear = CONVERT(CHAR(4),MIN(PaymentYear))

    ,MaxPmtYear = CONVERT(CHAR(4),MAX(PaymentYear))

    FROM #Payment

    GROUP BY PersonID

    )

    , cteCalendarPerPerson AS

    (--=== Create a full year of payment weeks per PersonID per Year

    SELECT pd.PersonID

    ,wc.ISOYear

    ,wc.ISOWeek

    FROM ctePersonPmtDates pd

    CROSS APPLY dbo.ISOWeekCalendar(pd.MinPmtYear, pd.MaxPmtYear) wc

    ) --=== Now, just do an outer join to return a full year of payments for each PersonID

    -- observing any payments already made. Missing payment info is identified by

    -- those rows in the return that have no PaymentID.

    SELECT p.PaymentId --will be NULL if no existing payment

    ,c.PersonId

    ,PaymentYear = c.ISOYear

    ,PaymentWeek = c.ISOWeek

    ,EePaid = ISNULL(p.EePaid,0)

    ,ErPaid = ISNULL(p.ErPaid,0)

    FROM cteCalendarPerPerson c

    LEFT JOIN #Payment p

    ON p.PersonId = c.PersonId

    AND p.PaymentYear = c.ISOYear

    AND p.PaymentWeek = c.ISOWeek

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • sys.user (10/30/2016)


    Jeff Moden (10/30/2016)


    sys.user (10/30/2016)


    Jeff Moden (10/30/2016)


    What is your "week" based on", especially for the first and last week of the year where it's normally not a 7 day week? Also, do you have a company calendar table in the system?

    Hi Jeff,

    Date settings are standard en-GB, Monday first day of the week, First week of the year is the first full week (eg, Week 1 this year 04/01/2016)

    So, ISO Weeks then?

    Yeah, ISO weeks

    Just dawned on me and wasn't paying attention like should. You said that the first week of the year is the first FULL week of the year. That's not how ISO works and I might not have done you any favors. ISO uses the rule that the first week of the year is whichever week contains the 4th of January.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hey, thanks for the help...

    In the end I just couldnt make it work the automated way:

    "for each record on a list of records with missing weeks insert missing weeks for X year"

    So what I ended up doing was a less fancy way like:

    Create a temporary copy of the Payment table and then insert the records into there, use my CTE and add the full week sets into the Payment table, then update using the data on my temporary Payment table using where WeekNumber = WeekNumber, Year =Year and PersonId = PersonId...

    Worked like a charm


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • >> I'm having some trouble finding a solution to a problem, its been a few hours now and I think I just completely loose track of what I'm doing and ended up a few times bashing and mix matching keywords around the lines of 'OUTER', 'MERGE', 'RIGHT', EXCEPT' and 'CROSS APPLY'. I need some extra help with this one. <<

    "Before you can drink new tea, you must empty the old tea from your cup. – Zen proverb

    You do not understand that rows are nothing like records, you do not understand how you have to model data, and your committing all kinds of fundamental errors. Let us try to fix this; this is going to be a long post, but if you had to pay me as a consultant to do this it would cost you $$$and have a lot of power points and it. 🙂

    >> I've got a table that holds batches of 52-53 records [sic] (one per week of the year) per every person_id. <<

    We do not model persons in RDBMS for the same reason we do not model things. This is too generic! What role, in the data model, does this person play? Have you ever been around small children who do not have the words to express concepts yet, they pointed at things and say "thingy, thingy!" because they do not have the words nor do they have the concept of the Law of Identity from logic ("to be is to be something in particular; to be nothing in particular or everything in general is to be nothing at all.")

    |

    >> Problem was that many of them had duplicated records [sic] for X year (eg. 2 or 3 year '2015W batches), deleting the duplicates was more or less a pain but now the task is to complete the ones that are missing weeks, so some of them have 20 records [sic] for year '2015W (52 week year) then I need to add the missing 32 weeks. <<

    The duplicate records should have been filtered out before you tried to turn them into rows in the database. This is based on a tiered architecture. This is a fundamental concept of all modern data processing and you should know it.

    Records are very generic, very sloppy and they need to be cleaned up. This is why we switched from the traditional monolithic architecture of data processing to a tiered architecture the time he got to RDBMS.

    >> Payment table is a bit like: >>

    Your payment table example is fundamentally wrong. Sorry, but it is. A table models a set, so it is name is a plural or collective noun. We do know math on identifiers, so they can never, never be numeric. By definition, a table must have a key. But what you posted has all NULL columns so you can never have a key, by definition! This is basic data modeling.

    Apparently you never do any research. Look up the ISO 8601 standards for weeks and dates within the year. These are very popular in Scandinavian countries, etc.

    CREATE TABLE Payments

    (payment_nbr CHAR(5) NOT NULL PRIMARY KEY,

    payer_id CHAR(10) NOT NULL,

    payment_week_date CHAR(10) NOT NULL

    CHECK(payment_date LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]-[1-7])

    ee_paid DECIMAL (18,2) NOT NULL

    CHECK (ee_paid >= 0.00),

    er_paid DECIMAL (18,2)NOT NULL

    CHECK (er_paid >= 0.00),

    );

    Please get in the habit of learning ISO standards if you are going to stay in IT as a profession. Your table should looked more like this:

    ('1256', '000002', '2015W26-1', 0.00, 0.00),

    ('1257', '000002', '2015W27-1', 0.00, 0.00),

    ('1258', '000002', '2015W28-1', 0.00, 0.00),

    ('1259', '000002', '2015W29-1', 0.00, 0.00),

    ('1260', '000002', '2015W30-1', 0.00, 0.00),

    ('1261', '000002', '2015W31-1', 0.00, 0.00),

    ('1262', '000002', '2015W32-1', 0.00, 0.00);

    But no good SQL programmer would write a while loop! This is a declarative language, we hate loops and other flow control structures the way vegans hate barbecue. Furthermore, no good SQL programmer would ever fill columns with zeros the way you have done. This is leaving punch card columns empty, written in SQL in 2016 instead of on real punchcards back in 1958. Did you want to use the NULL, that we have in SQL?

    If you do not need the full ISO 8601 week date, you can use the first few characters of the display string ([1,2][0-9][0-9][0-9]W[0-5][0-9]) to get the week within year. Your design error is called "attributes splitting" and it involves taking things that should have been represented as one scalar value and splitting them over multiple columns. This is a common newbie problem; do not feel bad.

    Now I am not sure exactly what you want to do with this data. Can you explain?

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • CELKO (11/8/2016)


    >> I'm having some trouble finding a solution to a problem, its been a few hours now and I think I just completely loose track of what I'm doing and ended up a few times bashing and mix matching keywords around the lines of 'OUTER', 'MERGE', 'RIGHT', EXCEPT' and 'CROSS APPLY'. I need some extra help with this one. <<

    "Before you can drink new tea, you must empty the old tea from your cup. – Zen proverb

    You do not understand that rows are nothing like records, you do not understand how you have to model data, and your committing all kinds of fundamental errors. Let us try to fix this; this is going to be a long post, but if you had to pay me as a consultant to do this it would cost you $$$and have a lot of power points and it. 🙂

    >> I've got a table that holds batches of 52-53 records [sic] (one per week of the year) per every person_id. <<

    We do not model persons in RDBMS for the same reason we do not model things. This is too generic! What role, in the data model, does this person play? Have you ever been around small children who do not have the words to express concepts yet, they pointed at things and say "thingy, thingy!" because they do not have the words nor do they have the concept of the Law of Identity from logic ("to be is to be something in particular; to be nothing in particular or everything in general is to be nothing at all.")

    |

    >> Problem was that many of them had duplicated records [sic] for X year (eg. 2 or 3 year '2015W batches), deleting the duplicates was more or less a pain but now the task is to complete the ones that are missing weeks, so some of them have 20 records [sic] for year '2015W (52 week year) then I need to add the missing 32 weeks. <<

    The duplicate records should have been filtered out before you tried to turn them into rows in the database. This is based on a tiered architecture. This is a fundamental concept of all modern data processing and you should know it.

    Records are very generic, very sloppy and they need to be cleaned up. This is why we switched from the traditional monolithic architecture of data processing to a tiered architecture the time he got to RDBMS.

    >> Payment table is a bit like: >>

    Your payment table example is fundamentally wrong. Sorry, but it is. A table models a set, so it is name is a plural or collective noun. We do know math on identifiers, so they can never, never be numeric. By definition, a table must have a key. But what you posted has all NULL columns so you can never have a key, by definition! This is basic data modeling.

    Apparently you never do any research. Look up the ISO 8601 standards for weeks and dates within the year. These are very popular in Scandinavian countries, etc.

    CREATE TABLE Payments

    (payment_nbr CHAR(5) NOT NULL PRIMARY KEY,

    payer_id CHAR(10) NOT NULL,

    payment_week_date CHAR(10) NOT NULL

    CHECK(payment_date LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]-[1-7])

    ee_paid DECIMAL (18,2) NOT NULL

    CHECK (ee_paid >= 0.00),

    er_paid DECIMAL (18,2)NOT NULL

    CHECK (er_paid >= 0.00),

    );

    Please get in the habit of learning ISO standards if you are going to stay in IT as a profession. Your table should looked more like this:

    ('1256', '000002', '2015W26-1', 0.00, 0.00),

    ('1257', '000002', '2015W27-1', 0.00, 0.00),

    ('1258', '000002', '2015W28-1', 0.00, 0.00),

    ('1259', '000002', '2015W29-1', 0.00, 0.00),

    ('1260', '000002', '2015W30-1', 0.00, 0.00),

    ('1261', '000002', '2015W31-1', 0.00, 0.00),

    ('1262', '000002', '2015W32-1', 0.00, 0.00);

    But no good SQL programmer would write a while loop! This is a declarative language, we hate loops and other flow control structures the way vegans hate barbecue. Furthermore, no good SQL programmer would ever fill columns with zeros the way you have done. This is leaving punch card columns empty, written in SQL in 2016 instead of on real punchcards back in 1958. Did you want to use the NULL, that we have in SQL?

    If you do not need the full ISO 8601 week date, you can use the first few characters of the display string ([1,2][0-9][0-9][0-9]W[0-5][0-9]) to get the week within year. Your design error is called "attributes splitting" and it involves taking things that should have been represented as one scalar value and splitting them over multiple columns. This is a common newbie problem; do not feel bad.

    Now I am not sure exactly what you want to do with this data. Can you explain?

    Damn! I like this...

    I'm not gonna defend my mess, but, I do wanna say that I tend to use singular table names for the sake of convenience when working with classes from Visual Studio...

    The data model (the table) has been around for many years, it has more columns actually, but the core purpose of it is to store payments (voluntary contribution payments) that go by a fixed weekly rate that is adjusted every year. Payments are made per week, weeks start on Monday and the first week of the year is the first full week of the year, ISO 8601.

    The mechanic goes by (for a new person): Person is added to the Person table => Payments are started for the current year by adding a full batch of records for the 52-53 weeks of the year (clearly all values zero) => Weeks start being filled (UPDATE) as person pays his/her weeks.

    I never agreed with the model of having to insert 52-53 records even if the person is paying 1 week, but that's the way the staff is used to work with this since they're able to 'see' a full year and whats missing.

    See this fragment of the UI:

    I did ask about this model in the past and the only suggestion/comment I've got was something around the lines of "if its not broken..."

    Table has a few million records right now...


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply