Delete duplicate data

  • Ok, so I have a table where some records have been inserted twice. However, the table has an identity field in the primary key, so the records are the same except for the identity field.

    How do I delete records that are duplicates, while maintaining at least one record? Is this possible to do in a single query?

    I can use this select query to get one copy of each duplicated record, but I can't use it to delete anything because the query won't identify a particular record

    SELECT DISTINCT t1.field1 FROM mytable t1 JOIN mytable t2 ON t1.IdentityKey<>t2.IdentityKey AND t1.field1=t2.field1

    I have a feeling I'm starting to go down the wrong path.

    ---
    Dlongnecker

  • You're close ...

    DECLARE @mytable TABLE (IdentityKey INT IDENTITY(1,1), field1 VARCHAR(10))

    INSERT @mytable

    select 'test1' UNION ALL

    select 'test1' UNION ALL

    select 'test2' UNION ALL

    select 'test3' UNION ALL

    select 'test3'

    SELECT * FROM @mytable

    DELETE FROM t1

    FROM @mytable AS t1

    INNER JOIN @mytable AS t2

    ON t1.IdentityKey < t2.IdentityKey

    AND t1.field1 = t2.field1

    SELECT * FROM @mytable

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Thanks Jason! That less than on the identity really blew my mind.

    ---
    Dlongnecker

  • Sometimes, just stepping back from the problem, walking around behind it and looking from the other side helps. 😀

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Hi,

    You can even write the query in below fashion.

    DELETE FROM

    table_name A

    WHERE

    a.IdentityColumn <

    ANY (

    SELECT

    B.IdentityColumn

    FROM

    table_name B

    WHERE

    A.col1 = B.col1

    AND

    A.col2 = B.col2

    );

    Thanks -- Vj

    http://dotnetvj.blogspot.com

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply