Help avoiding RBAR

Question

Post reply

Help avoiding RBAR

FNS

SSCrazy

Points: 2015
More actions
May 19, 2009 at 5:06 am

#200652

There must be a set based answer to this ugly situation but my brain isn't finding it and time is running out. So I'm appealing for help.

There are some duplicates names in a table (without a primary key and 600,000 rows) which are distinguished by date. I need to delete all but the newest dated record for each person. Ie the person can become the primary key.

I can easily get a list of those names which are duplicated into a temporary table, but it's cycling through these names to get the MAX date for each one that's causing me to run to the WHILE loop. Please help me avoid the RBAR solution if possible. If it's not then do I use a CURSOR or stick with my C# programming and use a WHILE loop.

CREATE TABLE #Table1

(

NameOfSomeone nvarchar(100),

DateOfSomeone datetime

)

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Paul', '2009-01-01')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Paul', '2009-01-02')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Peter', '2009-01-03')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Peter', '2009-01-04')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Peter', '2009-01-07')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('James', '2009-01-05')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('James', '2009-01-06')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Amy', '2009-01-01')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Grace', '2009-01-01')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Jane', '2009-01-01')

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Ramesh Saive SSC-Insane Points: 24275 More actions · Answer 1

There are many methods by which you can delete duplicate rows from one table. Here is one of the ways you can do it.

It can also be done using temporary table with IDENTITY column and inserting the data into this table in

sorted order as explained here



IF ( OBJECT_ID( 'tempdb..#Table1' ) IS NOT NULL )

	DROP TABLE #Table1

CREATE TABLE #Table1

(

NameOfSomeone nvarchar(100),

DateOfSomeone datetime

)

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Paul', '2009-01-01')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Paul', '2009-01-02')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Peter', '2009-01-03')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Peter', '2009-01-04')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Peter', '2009-01-07')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('James', '2009-01-05')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('James', '2009-01-06')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Amy', '2009-01-01')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Grace', '2009-01-01')

INSERT INTO #Table1 (NameOfSomeone, DateOfSomeone) VALUES ('Jane', '2009-01-01')

DELETE	T1

FROM	#Table1 T1

		INNER JOIN

		(

			SELECT	NameOfSomeone, MAX( DateOfSomeone ) AS DateOfSomeone

			FROM	#Table1

			GROUP BY NameOfSomeone

		) T2 ON T1.NameOfSomeone = T2.NameOfSomeone	AND T1.DateOfSomeone != T2.DateOfSomeone

SELECT * FROM #Table1

--Ramesh

Tom257 SSChampion Points: 11475 More actions · Answer 2

---- Edit ----------------------------------------------------------

Sorry ignore this: I've just noticed its the SQL2000 group, and ROW_NUMBER() isn't available. But I've left it here for information only.

-------------------------------------------------------------------

Another de-duping method, using the ROW_NUMBER PARTITION BY feature.



WITH DUPS AS (

  SELECT NameOfSomeone, DateOfSomeone, ROW_NUMBER() OVER (PARTITION BY NameOfSomeone ORDER BY DateOfSomeone DESC) AS RN

  FROM #Table1

)

DELETE FROM #Table1

FROM #Table1 JOIN DUPS

ON #Table1.NameOfSomeone = DUPS.NameOfSomeone AND

#Table1.DateOfSomeone = DUPS.DateOfSomeone

WHERE DUPS.RN > 1

This creates A CTE with each name in a separate group (partition) - and the dates are numbered in descending sequence within each group, so you can delete all those with a row number greather than 1.

As with any potentially dangerous changes, try it out in a transaction with rollback first.

Ramesh Saive SSC-Insane Points: 24275 More actions · Answer 3

Ramesh Saive

SSC-Insane

Points: 24275

May 19, 2009 at 6:16 am

#996863

Tom Brown (5/19/2009)

Another de-duping method, using the ROW_NUMBER PARTITION BY feature.

It does work in 2K5 but we are in 2K forum. It happened to me many times in the past:-D

--Ramesh

FNS SSCrazy Points: 2015 More actions · Answer 4

A huge thankyou to all who replied. 😀 I was so busy looking at memory tables and then joining them with the original I lost sight of the root cause - cr*p data.

I've run the script by SSCrazy and it worked a treat! :w00t: The 3rd party's data is now in a fit state to be imported into live, the boss is happy and I can get to work on the next stages of integration.

Now where's the emoticon for a beer...

Tom257 SSChampion Points: 11475 More actions · Answer 5

Tom257

SSChampion

Points: 11475

May 19, 2009 at 6:52 am

#996898

FNS (5/19/2009)

Now where's the emoticon for a beer...

Steve ... ?

We could do with a few more emoticons

these ones don't display properly :beer: :rocks: :thumbsup:

Ramesh Saive SSC-Insane Points: 24275 More actions · Answer 6

FNS (5/19/2009)

A huge thankyou to all who replied. 😀 I was so busy looking at memory tables and then joining them with the original I lost sight of the root cause - cr*p data.

I've run the script by SSCrazy and it worked a treat! :w00t: The 3rd party's data is now in a fit state to be imported into live, the boss is happy and I can get to work on the next stages of integration.

Now where's the emoticon for a beer...

You are welcome, and I am glad that it worked out well for you. "BeerIcon", I wandered around for years but still couldn't find one:w00t::w00t:

BTW, I am not SSCrazy:hehe:

--Ramesh