SQL Query help - Delete records based on above and below record values

  • Hi

    I need a way to check the next and previous row values depending on another column value.. In the example below since ID's 3,4 are 0's and rows below and down are 100, i should ignore/delete those fields i.e.,3 and 4... also, if the rows above and below are not equal to 100 like for rows 6,7,8,9... i should consider all the values while doing other calculations ..so i need a way to look at the above/below rows..

    Id DID Value date

    1 1 100 2010-03-10

    2 1 100 2010-03-10

    3 1 0 2010-03-10

    4 1 0 2010-03-10

    5 1 100 2010-03-10

    6 1 100 2010-03-10

    7 1 0 2010-03-10

    8 1 0 2010-03-10

    9 1 20 2010-03-10

    CREATE TABLE ##temp

    (

    ID INT IDENTITY(1,1),

    DID INT,

    Value INT,

    Date DateTime

    )

    INSERT INTO ##temp (DID,Value,Date)

    SELECT 1,100,'2010-03-10 00:00:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:05:00' UNION ALL

    SELECT 1,0,'2010-03-10 00:10:00' UNION ALL

    SELECT 1,0,'2010-03-10 00:15:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:50:00' UNION ALL

    SELECT 1,100,'2010-03-10 10:00:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:00:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:05:00' UNION ALL

    SELECT 1,20,'2010-03-10 11:30:00'

    DROP TABLE ##Temp

    Thanks in advance

  • I would use the "quirky update" method to set a flag in a separate column (delete flag).

    For details on how to use this method please see Jeffs great article[/url].

    I consider it mandatory to completely read and follow the instructions / prerequisites as described in the article.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I have seen the article before posting, but i am unable to utilize it to solve the issue, though it is related to my post.. Can you guide me. Thanks

  • Let's try to clarify the business requirement:

    What would be the expected output if you'd add a row with id=10 and value =100? What rows will be ignored? (my assumption: 3,4,7,8, and 9)

    How would the result look like with an additional row ID=0, value =10?

    The longer I think about the problem the more I think it's an issue of dealing with min(id) and max(id) for value=100 and how to deal with the rows in between.

    The "quirky update" solution might not be the best solution in this case...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • The complete business requirement is described in this post

    http://qa.sqlservercentral.com/Forums/Topic940069-391-1.aspx?Update=1

    I was able to finish most part of it, except that i am unable to apply this rule.. Records should be ignored if and only if the 2 consecutive records above and below equal 100 .. so, even if you add row with Id 10 only 3,4 rows should be ignored. You will need to consider all other rows including row 10 for other calculations.. because row 9 value = 20. All these calculation should be performed per DID basis.. so, If you have a DID =0, we would have to do the same for 0 also..

    Hopefully, my post is clear

    Thanks

  • Posting your problem in multiple threads doesn't only waste resources, it also slows down the time until you get an answer that will actually solve your issue (especially if it is "urgent").

    I need some time to read through the parallel thread you mentioned. So it will take a while before I post back.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Your description of the requirements is really hard to understand, but the way I interpret you this should work:

    select * -- replace with delete to actually delete

    from ##temp t1

    where

    value = 0

    and (select top 1 value from ##temp t2 where t2.did = t1.did and t2.id < t1.id and t2.value <> 0 order by id desc) = 100

    and (select top 1 value from ##temp t2 where t2.did = t1.did and t2.id > t1.id and t2.value <> 0 order by id asc) = 100

    Let me know if it is not doing what it should.

  • After reading your other post a bit more carefully I now think that what you really want is something like this:

    CREATE TABLE ##temp

    (

    ID INT IDENTITY(1,1),

    DID INT,

    Value INT,

    Date DateTime

    )

    INSERT INTO ##temp (DID,Value,Date)

    SELECT 1,100,'2010-03-10 00:00:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:05:00' UNION ALL

    SELECT 1,0,'2010-03-10 00:10:00' UNION ALL

    SELECT 1,null,'2010-03-10 00:15:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:50:00' UNION ALL

    SELECT 1,100,'2010-03-10 10:00:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:00:00' UNION ALL

    SELECT 1,100,'2010-03-10 11:02:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:05:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:30:00'

    select * from ##temp

    select *

    from ##temp t1

    where

    isnull(value,0) = 0

    -- check that the two previous not-zero rows with the same DID has the value 100

    and (select count(*) from (select top 2 value from ##temp t2 where t2.did=t1.did and t2.id < t1.id and isnull(t2.value,0)<>0 order by id desc) dt where value = 100) = 2

    -- check that the two following not-zero rows with the same DID has the value 100

    and (select count(*) from (select top 2 value from ##temp t2 where t2.did=t1.did and t2.id > t1.id and isnull(t2.value,0)<>0 order by id asc) dt where value = 100) = 2

    DROP TABLE ##temp

    Note that I have made some changes to your example data to expose the difference between my previous version and this version.

  • Posting your problem in multiple threads doesn't only waste resources, it also slows down the time until you get an answer that will actually solve your issue (especially if it is "urgent").

    I actually posted the complete business req in my 1st post but there was no response and almost got the solution except for this part.. so had to create another post .. 🙁

  • Thanks a lot stefan, You are a life saver.. it almost worked until i executed it against the below example.. Since none of the two Values above and below are equal to 100.. Only rows 16 and 17 should be ignored but when i run this query it actually

    Delete's(when replaced select with delete) row 4,6,8 which should be considered for further calculations apart from row 16,17.. Help me..Thanks Again

    INSERT INTO ##temp (DID,Value,Date)

    SELECT 1,100,'2010-03-10 00:00:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:05:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:10:00' UNION ALL

    SELECT 1,0,'2010-03-10 00:15:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:50:00' UNION ALL

    SELECT 1,0,'2010-03-10 10:00:00' UNION ALL

    SELECT 1,100,'2010-03-10 11:00:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:02:00' UNION ALL

    SELECT 1,100,'2010-03-10 11:02:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:05:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:30:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:00:00' UNION ALL

    SELECT 1,10,'2010-03-10 00:05:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:05:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:10:00' UNION ALL

    SELECT 1,0,'2010-03-10 00:15:00' UNION ALL

    SELECT 1,0,'2010-03-10 11:30:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:10:00' UNION ALL

    SELECT 1,100,'2010-03-10 00:00:00'

  • Hmm you are right.

    Try this instead:

    ;with cte as (

    select

    t1.*,

    -- get the id of the previous non-zero row

    (select top 1 id from ##temp t2 where t2.did=t1.did and t2.id < t1.id and isnull(t2.value,0)<>0 order by id desc) as id1,

    -- get the id of the following non-zero row

    (select top 1 id from ##temp t2 where t2.did=t1.did and t2.id > t1.id and isnull(t2.value,0)<>0 order by id asc) as id2

    from ##temp t1

    -- we are only interested in rows where value is 0 or null

    where isnull(t1.value,0) = 0

    )

    delete from ##temp where id in (

    select c.id

    from cte c

    -- join with the four rows that we are interested in

    join ##temp t1 on t1.id = c.id1

    join ##temp t2 on t2.id = c.id1-1 and t2.did = c.did

    join ##temp t3 on t3.id = c.id2

    join ##temp t4 on t4.id = c.id2+1 and t4.did = c.did

    -- check if all values are 100

    where t1.value=100 and t2.value=100 and t3.value=100 and t4.value=100

    )

    To get decent performance if the table is large you should make sure that you have a clustered index on the id column.

  • Thanks a ton, that works great.. i have changed it accordingly.. Now the only thing i am worried about is Performance because this query is just a part of the REQ and there is a lot of other calculations and its executed against 15 Million records daily.. did not test it against the whole set..But will let you know, once i completely test it..

    Thanks a lot again

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply