November 14, 2019 at 6:56 pm
I want to look for blocks of records repeating in a table.
pattern is unknown...to keep it simple - table X can have up to 100 records
in the example below:
The block: A,B and C is repeated 2 times
I don't care about (A and B) or (B and C) repeated twice...I am looking for the full scenario
Table X - Id (int),Value (varchar(8)
1,A
2,B
3,C
4,F
5,P
6,A
7,B
8,C
9,T
10,Y
in the example below:
The block: L,M,N,O is repeated 2 times
Table X - Id (int),Value (varchar(8)
1,L
2,M
3,N
4,O
5,P
6,A
7,L
8,M
9,N
10,O
I do not know how to go about this...values are being updated multiple times in a day from various overlapping jobs
November 14, 2019 at 7:07 pm
Can you post directly usable data, i.e., CREATE TABLE and INSERT statements.
SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!
November 15, 2019 at 4:23 pm
Here's a way using recursion, probably not efficient but appears to work with your data.
WITH recur AS (
SELECT x1.id AS idstart,
x1.value AS valuestart,
x1.id AS idend,
x2.id AS id2,
x1.value AS valueend,
1 as runlength
FROM X x1
CROSS APPLY (SELECT TOP 1 xt.id, xt.value FROM X xt WHERE xt.id > x1.id AND xt.value = x1.value ORDER BY xt.id) x2
UNION ALL
SELECT r.idstart,
r.valuestart,
x1.id AS idend,
x2.id AS id2,
x1.value AS valueend,
r.runlength+1
FROM recur r
INNER JOIN X x1 ON x1.id = r.idend + 1
INNER JOIN X x2 ON x2.id = r.id2 + 1 AND x2.value = x1.value
)
SELECT r.idstart,r.valuestart,r.valueend,r.runlength
FROM recur r
WHERE r.runlength > 1
AND NOT EXISTS(SELECT * FROM recur r2
WHERE r.idend BETWEEN r2.idstart AND r2.idend
AND (r2.idstart < r.idstart OR r2.idend > r.idend))
ORDER BY r.idstart;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537November 15, 2019 at 5:44 pm
You didn't specify a length of the strings you wanted. This illustrates finding repeats from 1 to 4 strings long. The LEAD() function allows the query to look 1 to X rows ahead for the desired values. The code below simply concatenates values from several adjacent rows, "unpivots" the resulting columns (using cross apply VALUES) and does a summary query on the number of rows where the string appeared.
if object_ID(N'tempdb..#SourceData') is not null drop table #SourceData
select *
into #SourceData
from (VALUES
(1,'L')
,(2,'M')
,(3,'N')
,(4,'O')
,(5,'P')
,(6,'A')
,(7,'L')
,(8,'M')
,(9,'N')
,(10,'O')) v (ID,[Value])
;with cte as (
select ID
,V1 = [Value]
,V2 = [Value]+Lead(Value) Over(Order by ID)
,V3 = [Value]+Lead(Value) Over(Order by ID)+Lead(Value,2) Over(Order by ID)
,V4 = [Value]+Lead(Value) Over(Order by ID)+Lead(Value,2) Over(Order by ID)+Lead(Value,3) Over(Order by ID)
from #sourceData
)
select Vstring, count(*) as Occurs, Min(ID) as FirstOccurrence, Max(ID) as LastOccurrence
from cte
cross apply (Values (V1), (V2), (V3), (V4)) v (Vstring)
where Vstring is not null
group by Vstring
having Count(*) > 1
order by len(Vstring),Vstring
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply