Extract duplicated records into new table!

  • Hello all,

    I have a big data table that contain duplicated records. I would like to extract all duplicated SSN records with other datas into a new table. The data table as following:

    Tablename: CustomerRecords

    SSN | FullName | Purchase Date .....

    1112223333 | John Smith | 02/12/2007

    5556667777 | Allen Lee | 12/03/2006

    1112223333 | John Smith | 05/06/2005

    9990001111 | Julie Ford | 12/09/2003

    1112223333 | John Smith | 08/02/2004

    .

    .

    .

    I knew my table having duplicated records by doing the bellowed query:

    SELECT SSN, COUNT (SSN) AS NumOccurrences

    FROM CustomerRecords

    GROUP BY SSN

    HAVING (COUNT(SSN) > 1)

    Any help either show me the TSQL for Select duplicated SSN records into new table or Delete the unique SSN records of CustomerRecords is much appreciated. Thanks in advance.

    Edited: Today @ 4:44 AM by ATroung

  • I got the answer from other TSQL for SS2K5. The solution was

    SELECT *

    FROM CustomerRecords

    WHERE SSN IN

    (SELECT SSN

    FROM CustomerRecords

    GROUP BY SSN

    HAVING (COUNT(SSN) > 1))

    ORDER BY SSN

  • Don't know if this is usefull for you but the following script allows you to find the oldest version of the record and when you know that you could delete that one in the original table... This is an example with a temp table... you will need to edit the script to your own table... Hope it helps.

    --=============

    drop table #temp

    create table #temp (id int, [name] varchar(256), date datetime, crrntdate datetime, datesec bigint, crrntdatesec bigint)

    insert into #temp( id, [name], date)

    select 1, 'test', getdate()

    union all

    select 1, 'test', dateadd(year, -1, getdate())

    update #temp set crrntdate = getdate()

    update #temp set datesec = CAST((CAST(date AS DECIMAL(19,9)) * 86400.0) AS BIGINT),

    crrntdatesec = CAST((CAST(crrntdate AS DECIMAL(19,9)) * 86400.0) AS BIGINT)

    select *

    from #temp t, (select id, [name], max(crrntdatesec - datesec) diff

    from #temp

    group by id, name) a

    where (t.crrntdatesec - t.datesec) = a.diff

    and t.id = a.id

    --==========

  • Hi,

    Your sample SQL did not relate to my question at all. Anyway, thank for your attention.

  • SELECT SSN, COUNT (SSN) AS NumOccurrences

    FROM CustomerRecords

    GROUP BY SSN

    HAVING (COUNT(SSN) > 1)

    Any help either show me the TSQL for Select duplicated SSN records into new table or Delete the unique SSN records of CustomerRecords is much appreciated. Thanks in advance.

    Select * into newTable

    from CustomerRecords

    where SSN in (SELECT SSN FROM CustomerRecords GROUP BY SSN HAVING (COUNT(SSN) > 1))

    select * from newTable


    Ronald San Juan | SQL DBA
    ID 710124 ~ Code "Northwind"

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply