SQL Query help - Delete records based on above and below record values

Question

Post reply

SQL Query help - Delete records based on above and below record values

danny09

Say Hey Kid

Points: 688
More actions
June 22, 2010 at 11:27 am

#225891

Hi

I need a way to check the next and previous row values depending on another column value.. In the example below since ID's 3,4 are 0's and rows below and down are 100, i should ignore/delete those fields i.e.,3 and 4... also, if the rows above and below are not equal to 100 like for rows 6,7,8,9... i should consider all the values while doing other calculations ..so i need a way to look at the above/below rows..

Id DID Value date 1 1 100 2010-03-10 2 1 100 2010-03-10 3 1 0 2010-03-10 4 1 0 2010-03-10 5 1 100 2010-03-10 6 1 100 2010-03-10 7 1 0 2010-03-10 8 1 0 2010-03-10
9 1 20 2010-03-10

CREATE TABLE ##temp ( ID INT IDENTITY(1,1), DID INT, Value INT, Date DateTime ) INSERT INTO ##temp (DID,Value,Date) SELECT 1,100,'2010-03-10 00:00:00' UNION ALL SELECT 1,100,'2010-03-10 00:05:00' UNION ALL SELECT 1,0,'2010-03-10 00:10:00' UNION ALL SELECT 1,0,'2010-03-10 00:15:00' UNION ALL SELECT 1,100,'2010-03-10 00:50:00' UNION ALL SELECT 1,100,'2010-03-10 10:00:00' UNION ALL SELECT 1,0,'2010-03-10 11:00:00' UNION ALL SELECT 1,0,'2010-03-10 11:05:00' UNION ALL SELECT 1,20,'2010-03-10 11:30:00' DROP TABLE ##Temp

Thanks in advance

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

LutzM SSC Guru Points: 107049 More actions · Answer 1

I would use the "quirky update" method to set a flag in a separate column (delete flag).

For details on how to use this method please see Jeffs great article[/url].

I consider it mandatory to completely read and follow the instructions / prerequisites as described in the article.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

danny09 Say Hey Kid Points: 688 More actions · Answer 2

danny09

Say Hey Kid

Points: 688

June 22, 2010 at 1:51 pm

#1183581

I have seen the article before posting, but i am unable to utilize it to solve the issue, though it is related to my post.. Can you guide me. Thanks

LutzM SSC Guru Points: 107049 More actions · Answer 3

Let's try to clarify the business requirement:

What would be the expected output if you'd add a row with id=10 and value =100? What rows will be ignored? (my assumption: 3,4,7,8, and 9)

How would the result look like with an additional row ID=0, value =10?

The longer I think about the problem the more I think it's an issue of dealing with min(id) and max(id) for value=100 and how to deal with the rows in between.

The "quirky update" solution might not be the best solution in this case...

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

danny09 Say Hey Kid Points: 688 More actions · Answer 4

The complete business requirement is described in this post

http://qa.sqlservercentral.com/Forums/Topic940069-391-1.aspx?Update=1

I was able to finish most part of it, except that i am unable to apply this rule.. Records should be ignored if and only if the 2 consecutive records above and below equal 100 .. so, even if you add row with Id 10 only 3,4 rows should be ignored. You will need to consider all other rows including row 10 for other calculations.. because row 9 value = 20. All these calculation should be performed per DID basis.. so, If you have a DID =0, we would have to do the same for 0 also..

Hopefully, my post is clear

Thanks

LutzM SSC Guru Points: 107049 More actions · Answer 5

Posting your problem in multiple threads doesn't only waste resources, it also slows down the time until you get an answer that will actually solve your issue (especially if it is "urgent").

I need some time to read through the parallel thread you mentioned. So it will take a while before I post back.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Stefan_G SSCertifiable Points: 6609 More actions · Answer 6

Your description of the requirements is really hard to understand, but the way I interpret you this should work:

select * -- replace with delete to actually delete



from ##temp t1

where

	value = 0

	and (select top 1 value from ##temp t2 where t2.did = t1.did and t2.id < t1.id and t2.value <> 0 order by id desc) = 100

	and (select top 1 value from ##temp t2 where t2.did = t1.did and t2.id > t1.id and t2.value <> 0 order by id asc)  = 100

Let me know if it is not doing what it should.

Stefan_G SSCertifiable Points: 6609 More actions · Answer 7

After reading your other post a bit more carefully I now think that what you really want is something like this:

CREATE TABLE ##temp



(

ID INT IDENTITY(1,1),

DID INT,

Value INT,

Date DateTime

)

INSERT INTO ##temp (DID,Value,Date)

SELECT 1,100,'2010-03-10 00:00:00' UNION ALL

SELECT 1,100,'2010-03-10 00:05:00' UNION ALL

SELECT 1,0,'2010-03-10 00:10:00' UNION ALL

SELECT 1,null,'2010-03-10 00:15:00' UNION ALL

SELECT 1,100,'2010-03-10 00:50:00' UNION ALL

SELECT 1,100,'2010-03-10 10:00:00' UNION ALL

SELECT 1,0,'2010-03-10 11:00:00' UNION ALL

SELECT 1,100,'2010-03-10 11:02:00' UNION ALL

SELECT 1,0,'2010-03-10  11:05:00' UNION ALL

SELECT 1,0,'2010-03-10 11:30:00'

select * from ##temp

select *

from ##temp t1

where

	isnull(value,0) = 0

	-- check that the two previous not-zero rows with the same DID has the value 100

	and (select count(*) from (select top 2 value from ##temp t2 where t2.did=t1.did and t2.id < t1.id and isnull(t2.value,0)<>0 order by id desc) dt where value = 100) = 2

	-- check that the two following not-zero rows with the same DID has the value 100

	and (select count(*) from (select top 2 value from ##temp t2 where t2.did=t1.did and t2.id > t1.id and isnull(t2.value,0)<>0 order by id asc)  dt where value = 100) = 2

DROP TABLE ##temp

Note that I have made some changes to your example data to expose the difference between my previous version and this version.

danny09 Say Hey Kid Points: 688 More actions · Answer 8

Posting your problem in multiple threads doesn't only waste resources, it also slows down the time until you get an answer that will actually solve your issue (especially if it is "urgent").

I actually posted the complete business req in my 1st post but there was no response and almost got the solution except for this part.. so had to create another post .. 🙁

danny09 Say Hey Kid Points: 688 More actions · Answer 9

Thanks a lot stefan, You are a life saver.. it almost worked until i executed it against the below example.. Since none of the two Values above and below are equal to 100.. Only rows 16 and 17 should be ignored but when i run this query it actually

Delete's(when replaced select with delete) row 4,6,8 which should be considered for further calculations apart from row 16,17.. Help me..Thanks Again



INSERT INTO ##temp (DID,Value,Date)

SELECT 1,100,'2010-03-10 00:00:00' UNION ALL

SELECT 1,100,'2010-03-10 00:05:00' UNION ALL

SELECT 1,100,'2010-03-10 00:10:00' UNION ALL

SELECT 1,0,'2010-03-10 00:15:00' UNION ALL

SELECT 1,100,'2010-03-10 00:50:00' UNION ALL

SELECT 1,0,'2010-03-10 10:00:00' UNION ALL

SELECT 1,100,'2010-03-10 11:00:00' UNION ALL

SELECT 1,0,'2010-03-10 11:02:00' UNION ALL

SELECT 1,100,'2010-03-10 11:02:00' UNION ALL

SELECT 1,0,'2010-03-10  11:05:00' UNION ALL

SELECT 1,0,'2010-03-10 11:30:00' UNION ALL

SELECT 1,100,'2010-03-10 00:00:00' UNION ALL

SELECT 1,10,'2010-03-10 00:05:00' UNION ALL

SELECT 1,100,'2010-03-10 00:05:00' UNION ALL

SELECT 1,100,'2010-03-10 00:10:00' UNION ALL

SELECT 1,0,'2010-03-10 00:15:00' UNION ALL

SELECT 1,0,'2010-03-10 11:30:00' UNION ALL

SELECT 1,100,'2010-03-10 00:10:00' UNION ALL

SELECT 1,100,'2010-03-10 00:00:00'

Stefan_G SSCertifiable Points: 6609 More actions · Answer 10

Hmm you are right.

Try this instead:

;with cte as (



	select

		t1.*,

		-- get the id of the previous non-zero row

		(select top 1 id from ##temp t2 where t2.did=t1.did and t2.id < t1.id and isnull(t2.value,0)<>0 order by id desc) as id1,

		-- get the id of the following non-zero row

		(select top 1 id from ##temp t2 where t2.did=t1.did and t2.id > t1.id and isnull(t2.value,0)<>0 order by id asc)  as id2

	from ##temp t1

	-- we are only interested in rows where value is 0 or null

	where isnull(t1.value,0) = 0

)

delete from ##temp where id in (

	select c.id

	from cte c

	-- join with the four rows that we are interested in

	join ##temp t1 on t1.id = c.id1

	join ##temp t2 on t2.id = c.id1-1 and t2.did = c.did

	join ##temp t3 on t3.id = c.id2

	join ##temp t4 on t4.id = c.id2+1 and t4.did = c.did

	-- check if all values are 100

	where t1.value=100 and t2.value=100 and t3.value=100 and t4.value=100

)

To get decent performance if the table is large you should make sure that you have a clustered index on the id column.

danny09 Say Hey Kid Points: 688 More actions · Answer 11

Thanks a ton, that works great.. i have changed it accordingly.. Now the only thing i am worried about is Performance because this query is just a part of the REQ and there is a lot of other calculations and its executed against 15 Million records daily.. did not test it against the whole set..But will let you know, once i completely test it..

Thanks a lot again