Selective change tracking in a column

Question

Post reply

Selective change tracking in a column

davidvarga086

Right there with Babe

Points: 747
More actions
October 19, 2017 at 2:07 am

#329263

Hi,

I have to count how many times a record changed from group 2 to group 1 or 3 per ID ordered by RowID

This is my table:
RowID ID GROUP 1 1 1 2 1 2 3 1 1 4 1 2 5 1 3 6 1 2 7 1 4 8 1 5 9 1 2 1 2 2 2 2 1 3 2 3

The end table should look like this:
RowID ID GROUP 3 1 1 5 1 3 2 2 1
This is where I'm trying to explain this a bit more:
So for all ID 1 records check how many times it was in group 2 then if it was in group 2 then check if the next record (based on the RowID) is in group 1 or 3 ignore rest, keep all the records where it changed from group 2 to 1 or 3
and do the same for ID 2,3,4 and so on.....

There must be a simple way to do this I just can't figure it out

Please help

Thank you

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

davidvarga086 Right there with Babe Points: 747 More actions · Answer 1

I figured it out 😀 I just leave the code here for future generations

SELECT B1.* FROM ( SELECT *,ROW_NUMBER() over(partition by B1.ID order by B1.RowID desc) as P_count FROM #tempbase B1 ) B1 LEFT JOIN ( SELECT *,ROW_NUMBER() over(partition by B2.ID order by B2.RowID desc) as P_count2 FROM #tempbase B2 ) B2 ON B1.ID = B2.ID AND B2.Group = 2 AND B1.Group IN(1,3) AND B1.P_count = B2.P_count2-1 WHERE B2.ID IS NOT NULL

drew.allen SSC Guru Points: 76776 More actions · Answer 2

The problem with this approach is that it requires two scans of the base table. Here is an approach that only requires one scan of the table.

; WITH Chg AS ( SELECT *, LAG(Grp) OVER(PARTITION BY ID ORDER BY RowID) AS PrevGroup FROM #tempbase ) SELECT RowID, ID, Grp FROM Chg WHERE Grp IN (1,3) AND PrevGroup = 2 ;

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 3

drew.allen - Thursday, October 19, 2017 8:48 AM

The problem with this approach is that it requires two scans of the base table. Here is an approach that only requires one scan of the table.

; WITH Chg AS ( SELECT *, LAG(Grp) OVER(PARTITION BY ID ORDER BY RowID) AS PrevGroup FROM #tempbase ) SELECT RowID, ID, Grp FROM Chg WHERE Grp IN (1,3) AND PrevGroup = 2 ;

Drew

I recommend always being fully expressive in code and not relying on defaults. So flesh out that LAG so everyone that reads/debugs/etc the code knows explicitly what it is doing.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

drew.allen SSC Guru Points: 76776 More actions · Answer 4

TheSQLGuru - Thursday, October 19, 2017 1:54 PM

drew.allen - Thursday, October 19, 2017 8:48 AM

The problem with this approach is that it requires two scans of the base table. Here is an approach that only requires one scan of the table.

; WITH Chg AS ( SELECT *, LAG(Grp) OVER(PARTITION BY ID ORDER BY RowID) AS PrevGroup FROM #tempbase ) SELECT RowID, ID, Grp FROM Chg WHERE Grp IN (1,3) AND PrevGroup = 2 ;

Drew

I recommend always being fully expressive in code and not relying on defaults. So flesh out that LAG so everyone that reads/debugs/etc the code knows explicitly what it is doing.

I don't think there is a right or wrong here. There are merits to both approaches. I prefer leaving out the defaults if they match what I would use anyhow, because it's less typing. The only time that I have seen an issue with using the defaults is in the frame for windowed functions, because many people are unaware of when a default frame is applied.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Jason A. Long SSC-Insane Points: 23731 More actions · Answer 5

I've got to go with Kevin on this one... My soul dies a little every time I review a script that has a CREATE TABLE script that looks like this...

CREATE TABLE ImAnAss ( ID INT IDENTITY PRIMARY KEY, SomeColumn VARCHAR, AnotherColumn DECIMAL DEFAULT (0), TheDate DATETIME DEFAULT (GETDATE()) );

drew.allen SSC Guru Points: 76776 More actions · Answer 6

Jason A. Long - Thursday, October 19, 2017 3:25 PM

I've got to go with Kevin on this one... My soul dies a little every time I review a script that has a CREATE TABLE script that looks like this...

CREATE TABLE ImAnAss ( ID INT IDENTITY PRIMARY KEY, SomeColumn VARCHAR, AnotherColumn DECIMAL DEFAULT (0), TheDate DATETIME DEFAULT (GETDATE()) );

I think that tables and variables are different, because they should be designed to match the expected data. Expected data varies too widely to rely on any defaults. I also like naming my constraints so that they're meaningful, which means that they would be different from the default names provided.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 7

The most important thing you said there Drew was "...should be...". But sadly they aren't. Probably 99% of all columns I have ever seen in SQL Server applications (25 years now) are NULLable (because that is the default), and the problems I have seen because of that are innumerable.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Jason A. Long SSC-Insane Points: 23731 More actions · Answer 8

TheSQLGuru - Friday, October 20, 2017 11:57 AM

The most important thing you said there Drew was "...should be...". But sadly they aren't. Probably 99% of all columns I have ever seen in SQL Server applications (25 years now) are NULLable (because that is the default), and the problems I have seen because of that are innumerable.