Dynamically checking rows in a table for data changes in SQL

  • I have a table (example testing code below) that has a number of records grouped for a particular individual (identified by the first two fields of the table) over a number of dates. This contains the address information at that particular date.

    My problem is I need to build up an address history of the start and end date that person was at a particular address. In other words, I need to look at the table in date order for an individual and look for when the address next changes. My desired result is in the address change spreadsheet attached i.e. I need a "From date" and "to date" for each distinct address, with the last record having a blank to date as that's the current address.

    How would I go about this please?

    create table #temp (

    [RecNum] [nvarchar](6),

    [RecInd] int,

    [Address1] nvarchar(30),

    [DateApplied] [datetime] NOT NULL

    ) ON [PRIMARY]

    Insert #temp

    select recnum = '100001', 1,'10 Acacia Avenue','01 sep 2010'

    union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 Oct 2010'

    union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 Nov 2010'

    union all

    select recnum = '100001', 1,'11 Acacia Avenue','01 dec 2010'

    union all

    select recnum = '100001', 1,'11 Acacia Avenue','01 jan 2011'

    union all

    select recnum = '100001', 1,'10 Banana Street','01 feb 2011'

    union all

    select recnum = '100001', 1,'10 Banana Street','01 mar 2011'

    union all

    select recnum = '100001', 1,'10 Banana Street','01 apr 2011'

    union all

    select recnum = '100025', 0,'10 Downing Street','01 jan 2011'

    union all

    select recnum = '100025', 0,'1 Pennsylvania Avenue','01 feb 2011'

  • What is the relationship between RecNum and RecInd? I've left out RecInd in the solution below because it looks redundant.

    The first thing to do is to group by RecNum and Address1 - this will give you the date on which each person moved into each address:

    SELECT RecNum, Address1, MIN(DateApplied) AS [From Date]

    FROM #temp

    GROUP BY RecNum, Address1

    ORDER BY RecNum, MIN(DateApplied)

    Now, since the To Date is just the From Date from the previous row where RecInd is the same, or blank (NULL) where it isn't, I think I would leave it to my presentation layer to add that column to my result set. If you insist on doing it in T-SQL, you could number your rows (partitioned by RecNum) and do a self-join on RecNum = RecNum and RowNo = RowNo+1. Have a read about ROW_NUMBER to find out how to number your rows.

    John

  • John Mitchell-245523 (8/26/2011)


    What is the relationship between RecNum and RecInd? I've left out RecInd in the solution below because it looks redundant.

    The first thing to do is to group by RecNum and Address1 - this will give you the date on which each person moved into each address:

    SELECT RecNum, Address1, MIN(DateApplied) AS [From Date]

    FROM #temp

    GROUP BY RecNum, Address1

    ORDER BY RecNum, MIN(DateApplied)

    Now, since the To Date is just the From Date from the previous row where RecInd is the same, or blank (NULL) where it isn't, I think I would leave it to my presentation layer to add that column to my result set. If you insist on doing it in T-SQL, you could number your rows (partitioned by RecNum) and do a self-join on RecNum = RecNum and RowNo = RowNo+1. Have a read about ROW_NUMBER to find out how to number your rows.

    John

    If you think of RecNum as being a customer but a customer can me Mr Bloggs and Mrs Bloggs. RecInd is the identifier for the "party" i.e. RecInd 0 would be Mr Bloggs and RecInd 1 would be Mrs Bloggs (they may have split up for example).

    Mmm. I'll see if I can work that out.

  • Griffster (8/26/2011)


    John Mitchell-245523 (8/26/2011)


    What is the relationship between RecNum and RecInd? I've left out RecInd in the solution below because it looks redundant.

    The first thing to do is to group by RecNum and Address1 - this will give you the date on which each person moved into each address:

    SELECT RecNum, Address1, MIN(DateApplied) AS [From Date]

    FROM #temp

    GROUP BY RecNum, Address1

    ORDER BY RecNum, MIN(DateApplied)

    Now, since the To Date is just the From Date from the previous row where RecInd is the same, or blank (NULL) where it isn't, I think I would leave it to my presentation layer to add that column to my result set. If you insist on doing it in T-SQL, you could number your rows (partitioned by RecNum) and do a self-join on RecNum = RecNum and RowNo = RowNo+1. Have a read about ROW_NUMBER to find out how to number your rows.

    John

    If you think of RecNum as being a customer but a customer can me Mr Bloggs and Mrs Bloggs. RecInd is the identifier for the "party" i.e. RecInd 0 would be Mr Bloggs and RecInd 1 would be Mrs Bloggs (they may have split up for example).

    Mmm. I'll see if I can work that out.

    Just thought, this won't work if the person moved away from an address to a different address, and then returned to the first address a while later. See what I mean?

  • Possibly. It works for the sample data you supplied. Please will you supply some data for which it won't work, and I'll take a look?

    John

  • John Mitchell-245523 (8/30/2011)


    Possibly. It works for the sample data you supplied. Please will you supply some data for which it won't work, and I'll take a look?

    John

    Here you go...

    create table #temp (

    [RecNum] [nvarchar](6),

    [RecInd] int,

    [Address1] nvarchar(30),

    [DateApplied] [datetime] NOT NULL )

    ON [PRIMARY]

    Insert #tempselect recnum = '100001', 1,'10 Acacia Avenue','01 sep 2010'

    union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 Oct 2010'

    union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 Nov 2010'

    union all

    select recnum = '100001', 1,'11 Banana Street','01 dec 2010'

    union all

    select recnum = '100001', 1,'11 Banana Street','01 jan 2011'

    union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 feb 2011'

    union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 mar 2011'

    union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 apr 2011'

    union all

    select recnum = '100025', 0,'10 Downing Street','01 jan 2011'

    union all

    select recnum = '100025', 0,'1 Pennsylvania Avenue','01 feb 2011'

  • I see what the issue is here, but I havn't got time to give it much consideration at the moment. The issue is that you have the same person living at the same addresses for two separate periods, and you therefore want them counted separately. To get you started, you could search this site for "gaps and islands" and see whether that gives you any clues. Please post back and let us know how you're getting on.

    John

  • Hello,

    I tried to solve this problem in a chain of transformations, see the code below

    create table #temp (

    [RecNum] [nvarchar](6),

    [RecInd] int,

    [Address1] nvarchar(30),

    [DateApplied] [datetime] NOT NULL

    ) ON [PRIMARY]

    Insert #temp

    select recnum = '100001', 1,'10 Acacia Avenue','01 sep 2010' union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 Oct 2010' union all

    select recnum = '100001', 1,'10 Acacia Avenue','01 Nov 2010' union all

    select recnum = '100001', 1,'11 Acacia Avenue','01 dec 2010' union all

    select recnum = '100001', 1,'11 Acacia Avenue','01 jan 2011' union all

    select recnum = '100001', 1,'10 Banana Street','01 feb 2011' union all

    select recnum = '100001', 1,'10 Banana Street','01 mar 2011' union all

    select recnum = '100001', 1,'10 Banana Street','01 apr 2011' union all

    select recnum = '100025', 0,'10 Downing Street','01 jan 2011' union all

    select recnum = '100025', 0,'1 Pennsylvania Avenue','01 feb 2011'

    ;

    -- First step. Add a RowNumber to be able to join the table with herself

    WITH OrderedAddresses AS (

    SELECT RecNum, RecInd, Address1, DateApplied

    , ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS RowNumber

    FROM #Temp

    )

    -- Second step. Join the table with herself linking each row with the previous date;

    -- then filter to avoid repeated addresses and maintain only address changes.

    , FilteredAddresses AS (

    SELECT A.*

    FROM OrderedAddresses A

    LEFT JOIN OrderedAddresses B ON B.RecNum = A.RecNum AND B.RecInd = A.RecInd AND A.RowNumber = B.Rownumber + 1

    WHERE A.Address1 <> ISNULL(B.Address1, '')

    )

    -- Third step. Add another RowNumber, to apply another join.

    , OrderAgain AS (

    SELECT *, ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS NewRowNumber

    FROM FilteredAddresses

    )

    -- Last step. Join the table with herself to obtain the date of leaving each address.

    SELECT A.*, B.DateApplied AS EndDate

    FROM OrderAgain A

    LEFT JOIN OrderAgain B ON B.RecNum = A.RecNum AND B.RecInd = A.RecInd AND B.NewRowNumber = A.NewRowNumber + 1

    drop table #temp

    Hope this helps,

    Francesc

  • Thanks Francesc, that works a dream! Good concept to work with in the future. Many thanks.

  • I have a slight problem with this and I'm really struggling to amend the code to cope with it. I have some really awkward files where the first address1 field is a null. Code falls over then. How would I cater for them please?

  • Maybe yo can replace these null values by some not null string. I tried replacing

    WITH OrderedAddresses AS (

    SELECT RecNum, RecInd, Address1, DateApplied

    , ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS RowNumber

    FROM #Temp

    )

    by

    WITH OrderedAddresses AS (

    SELECT RecNum, RecInd, ISNULL(Address1, 'No address') AS Address1, DateApplied

    , ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS RowNumber

    FROM #Temp

    )

    Francesc

  • Aha. That seems to have worked. I (stupidly) was concentrating on the second CTE (filteredaddresses) and was testing isnull(A.Address1,'') <> ISNULL(B.Address1, '')) in my where clause. Thanks again.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply