Help with and update statement for duplication records (I dont want to deleted duplicates)

  • I have a table that contains duplicate entries. I don’t want to delete the entries but update all of them except 1.

    For example (This example has been simplified and my table is larger with thousands of records)

    Imagine a table with 3 columns as per the below. Note: deleted Column can be 1 or 0 (yes or no), this is the column I’m trying to update.

    ID Name deleted

    1 Orange 0

    2 Orange 0

    3 RED 0

    4 RED 0

    5 RED 0

    I want all the duplicate records ‘deleted column’ to be updated to 1, so I would expect the output to be as follows. It doesn’t matter what entries is updated.

    ID Name deleted

    1 Orange 0

    2 Orange 1

    3 RED 0

    4 RED 1

    5 RED 1

    I can only find references to delete duplicates. Can anyone help?

  • DECLARE @T TABLE(ID INT, Name VARCHAR(10), deleted INT)

    INSERT INTO @T(ID,Name)

    SELECT 1, 'Orange' UNION ALL

    SELECT 2, 'Orange' UNION ALL

    SELECT 3, 'RED' UNION ALL

    SELECT 4, 'RED' UNION ALL

    SELECT 5, 'RED';

    WITH CTE AS (

    SELECT ID,Name,deleted,

    ROW_NUMBER() OVER(PARTITION BY Name ORDER BY ID) AS rn

    FROM @T)

    UPDATE CTE

    SET deleted=CASE WHEN rn=1 THEN 0 ELSE 1 END;

    SELECT * FROM @T ORDER BY ID;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Hi, not sure that will work as this table has thousands of records and your query lists the entries...assuming I'm reading it correctly.

    this is where I have managed to get to regarding my problem.

    I'm finding the duplicates using the following query.

    select *

    from mytablename

    where

    (name in

    (select name

    from mytablename as mytablename_1

    group by name

    having (count(name) > 1)))

    order by name desc

    this returns all duplicates. Im trying to change this to update the 'deleted' column in this table from 0 to 1 for all the duplicate entries, except one that will remain a 0.

    the query I have, but doesn't work correctly is

    update mytablename

    set deleted = '1'

    where

    (name in

    (select name

    from mytablename as mytablename_1

    group by name

    having (count(name) > 1)))

    but this basically updates ALL duplicates with a 1 in the 'deleted column' I only require it to update all duplicates except 1 that will keep a 0 in the deleted column.

    Any ideas?

  • I had some help with this, and this seems to work.

    update mytablename

    set deleted = '1'

    where

    (ID in

    (select MIN(ID)

    from mytablename as mytablename_1

    group by name, deleted

    having (count(name) > 1)))

  • Not sure you understood myquery, but this should work

    WITH CTE AS (

    SELECT *,

    ROW_NUMBER() OVER(PARTITION BY Name ORDER BY ID) AS rn

    FROM mytablename)

    UPDATE CTE

    SET deleted=1

    WHERE rn>1;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply