repeating block of records

  • I want to look for blocks of records repeating in a table.

     

    pattern is unknown...to keep it simple - table X can have up to 100 records

     

    in the example below:

    The block: A,B and C is repeated 2 times

    I don't care about (A and B) or (B and C) repeated twice...I am looking for the full scenario

     

    Table X - Id (int),Value (varchar(8)

    1,A

    2,B

    3,C

    4,F

    5,P

    6,A

    7,B

    8,C

    9,T

    10,Y

     

     

    in the example below:

    The block: L,M,N,O is repeated 2 times

     

    Table X - Id (int),Value (varchar(8)

    1,L

    2,M

    3,N

    4,O

    5,P

    6,A

    7,L

    8,M

    9,N

    10,O

     

    I do not know how to go about this...values are being updated multiple times in a day from various overlapping jobs

     

     

  • Can you post directly usable data, i.e., CREATE TABLE and INSERT statements.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Here's a way using recursion, probably not efficient but appears to work with your data.

    WITH recur AS (
    SELECT x1.id AS idstart,
           x1.value AS valuestart,
           x1.id AS idend,
        x2.id AS id2,
        x1.value AS valueend,
        1 as runlength
    FROM X x1
    CROSS APPLY (SELECT TOP 1 xt.id, xt.value FROM X xt WHERE xt.id > x1.id AND xt.value = x1.value ORDER BY xt.id) x2
    UNION ALL
    SELECT r.idstart,
           r.valuestart,
           x1.id AS idend,
        x2.id AS id2,
        x1.value AS valueend,
        r.runlength+1
    FROM recur r
    INNER JOIN X x1 ON x1.id = r.idend + 1
    INNER JOIN X x2 ON x2.id = r.id2 + 1 AND x2.value = x1.value
    )
    SELECT r.idstart,r.valuestart,r.valueend,r.runlength
    FROM recur r
    WHERE r.runlength > 1
      AND NOT EXISTS(SELECT * FROM recur r2
                     WHERE r.idend BETWEEN r2.idstart AND r2.idend
           AND (r2.idstart < r.idstart OR r2.idend > r.idend))
    ORDER BY r.idstart;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • You didn't specify a length of the strings you wanted.   This illustrates finding repeats from 1 to 4 strings long.    The LEAD() function allows the query to look 1 to X rows ahead for the desired values.    The code below simply concatenates values from several adjacent rows, "unpivots" the resulting columns (using cross apply VALUES) and does a summary query on the number of rows where the string appeared.

    if object_ID(N'tempdb..#SourceData') is not null drop table #SourceData
    select *
    into #SourceData
    from (VALUES
    (1,'L')
    ,(2,'M')
    ,(3,'N')
    ,(4,'O')
    ,(5,'P')
    ,(6,'A')
    ,(7,'L')
    ,(8,'M')
    ,(9,'N')
    ,(10,'O')) v (ID,[Value])

    ;with cte as (
    select ID
    ,V1 = [Value]
    ,V2 = [Value]+Lead(Value) Over(Order by ID)
    ,V3 = [Value]+Lead(Value) Over(Order by ID)+Lead(Value,2) Over(Order by ID)
    ,V4 = [Value]+Lead(Value) Over(Order by ID)+Lead(Value,2) Over(Order by ID)+Lead(Value,3) Over(Order by ID)
    from #sourceData
    )

    select Vstring, count(*) as Occurs, Min(ID) as FirstOccurrence, Max(ID) as LastOccurrence
    from cte
    cross apply (Values (V1), (V2), (V3), (V4)) v (Vstring)
    where Vstring is not null
    group by Vstring
    having Count(*) > 1
    order by len(Vstring),Vstring

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply