Identifying column data that is the same in multiple records

Question

Post reply

Identifying column data that is the same in multiple records

pwilly

SSChasing Mays

Points: 643
More actions
January 22, 2009 at 12:40 pm

#133727

Hello,

I have a database that holds various info about people. First names, last names, nicknames, who entered them into the system, etc. I am trying to find duplicates in the system, people that were entered more than once. For instance one user may have entered John Smith while another user may have entered the same person as John Wayne Smith. I believe the following script correctly identifies all of the names that appear more than once:

SELECT lastname, firstname, COUNT(lastname) AS '# of matching Lastnames', COUNT(firstname) AS '# of matching Firstnames' FROM subject GROUP BY lastname, firstname HAVING COUNT(*) > 1

When I run this I end up with 67 rows returned. My problem now is I would like to be able to list additional information about the 67 people returned from the query above such as date of birth and nicknames but this additional info resides in other tables in the database so I would have to cross reference with join statements which I did:

SELECT lastname, firstname, birthdate,COUNT(lastname) AS '# of matching Lastnames', COUNT(firstname) AS '# of matching Firstnames' FROM subject INNER JOIN subjectdob b ON subject.subjectid = b.subjectid GROUP BY lastname, firstname, birthdate HAVING COUNT(*) > 1

But when I run this I end up getting only 19 rows returned. Most of these rows are do not even match the rows that were returned in the first query. Some of the people returned from the first query are returned from this one but many were not plus some people returned from this query were not returned from the first one. I am assuming that it is looking for multiple instances of lastname, firstname AND birthdate but I am not interested in finding matching birthdates I just want to know what the birthdate is for the records where the last name and first name both happen to match.

Is it possible to get additional info from other tables returned for the first query above without screwing up the results as I obviously did unintentionally in the second query?

Thanks

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply

Ray M SSC-Insane Points: 21093 More actions · Answer 1

Use that table that you found the duplicates in as a derived table, Join to it on the fields you believe are the unique ones back to the original table(s)

Select *

FROM subject x

INNER JOIN subjectdob y

ON subject.subjectid = b.subjectid

INNER JOIN (SELECT lastname, firstname, birthdate,COUNT(lastname) AS '# of matching Lastnames', COUNT(firstname) AS '# of matching Firstnames'

FROM subject

INNER JOIN subjectdob b

ON subject.subjectid = b.subjectid

GROUP BY lastname, firstname, birthdate

HAVING COUNT(*) > 1) AS dt

ON x.lastname = dt.lastname

and x.firstname = dt.firstname

and x.birthdate = dt.birthdate

pwilly SSChasing Mays Points: 643 More actions · Answer 2

pwilly

SSChasing Mays

Points: 643

January 22, 2009 at 1:35 pm

#931439

Thanks Ray! I have it working like a charm now!