How to Remove duplicate records (rows)

  • There is a table, with a primary key with the data type of uniqueidentifier and the rest of the columns are of data type varchar, i am using the following query to remove duplicates:

    DELETE

    FROM dbo.SMRights

    WHERE gRightSID NOT IN

    (

    SELECT MAX(gRightSID)

    FROM dbo.SMRights

    GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3,Duplicatecolumn4,Duplicatecolumn5)

    The table name is SMRights and the uniqueidentifier column is gRightSID, the other columns are the ones which have duplicate rows against the gRightSID.

    Duplicatecolumn1 2 and 3 are of data type varchar

    Duplicatecolumn4 and 5 are of data type text

    When I run this I get a message ::: Operand data type uniqueidentifier is invalid for max operator.

    Operand data type uniqueidentifier is invalid for max operator.

    Please let me know how i can remove the duplicate records.

    Thanks & Best Regards

  • DELETE

    FROM dbo.SMRights

    WHERE gRightSID NOT IN

    (

    SELECT a.gRightSID

    FROM (select row_number() over( prtition by DuplicateColumn1, DuplicateColumn2, DuplicateColumn3,Duplicatecolumn4,Duplicatecolumn5 order by Duplicatecolumn5) as rownum, gRightSID from dbo.SMRights ) a where a.rownum=2)

    hope it works.

    There is always something new to learn.
    My personal SQL Blog[/url]

  • ;WITH SequencedTable AS (

    SELECT GroupID = ROW_NUMBER() OVER (PARTITION BY DuplicateColumn1, DuplicateColumn2, ...)

    FROM dbo.SMRights)

    DELETE FROM SequencedTable WHERE GroupID > 1

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi gheorghiu_horatiu,

    There is a problem at the last statement near the where clause

    a where a.rownum=2)

  • Guys, thank you for your reply,

    But i am getting this error:

    The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

    Msg 306, Level 16, State 2, Line 7

    The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

    Its so confusing and i am stuck 🙁

  • You could run that multiple times to clear all rows, which is something to do if you are managing log sizes or worried about activity with a large transactions.

  • wmalik001 (5/20/2010)


    Guys, thank you for your reply,

    But i am getting this error:

    The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

    Msg 306, Level 16, State 2, Line 7

    The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

    Its so confusing and i am stuck 🙁

    Please provide the script for table SMRights

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • So you are saying that these error messages are ignorable? and I should keep running the query and it will remove the duplicates from the table?

  • No, those error messages are based on your data types. You can't submit those commands on those types.

    Please post some DDL for the tables.

  • Ok the first one i used is :

    DELETE

    FROM dbo.SMRights

    WHERE gRightSID NOT IN

    (

    SELECT a.gRightSID

    FROM (select row_number() over( partition by cRightID,cDesc,cDataDef,cGroup,cHelp,cDataHelp

    order by cHelp) as rownum, gRightSID from dbo.SMRights)

    a where a.rownum=2)

    I got the error :

    Msg 306, Level 16, State 2, Line 6

    The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

    Msg 306, Level 16, State 2, Line 6

    The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

    Msg 306, Level 16, State 2, Line 7

    The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

    Then i used Chris's script

    WITH SequencedTable AS (

    SELECT GroupID = ROW_NUMBER() OVER (partition by cRightID,cDesc,cDataDef,cGroup,cHelp,cDataHelp

    order by cHelp)

    FROM dbo.SMRights)

    DELETE FROM SequencedTable WHERE GroupID > 1

    And got the same errors as above;

    The table(SMRights) is as follows:

    Columns:

    gRightSID(PK,uniqueidentifier,not null)

    cRightID(varchar(50),not null)

    cDesc(varchar(50),not null)

    cDataDef(varchar(1024),not null)

    cHelp(text,not null)

    cDataHelp(text,not null)

  • All i want to do is to remove the duplicate records from these columns:

    Columns:

    gRightSID(PK,uniqueidentifier,not null) ----> this one has not duplicate record

    cRightID(varchar(50),not null)---> if this column has duplicate records then delete that entire row

    cDesc(varchar(50),not null)

    cDataDef(varchar(1024),not null)

    cHelp(text,not null)

    cDataHelp(text,not null)

  • ;WITH SequencedTable AS (

    SELECT GroupID = ROW_NUMBER() OVER (PARTITION BY cRightID ORDER BY cDesc)

    FROM dbo.SMRights)

    DELETE FROM SequencedTable WHERE GroupID > 1

    -- Note that the choice of sort column is up to you!

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thank You Chris, looks like this is the solution, once again thank you so much for your time in helping me out, your the man!

  • wmalik001, you should not use text data type, unless you use sql server 2000. You should use varchar(max)...

    There is always something new to learn.
    My personal SQL Blog[/url]

  • Hello Friends.......

    To delete similar records, i.e. where the records are not the same but one field is the same and only one copy needs to be preserved, try the following SQL:

    delete T1

    from MyTable T1, MyTable T2

    where T1.dupField = T2.dupField

    and T1.uniqueField > T2.uniqueField

    This will delete all records from the table MyTable which have the same value for the field dupField, leaving that record which has the lowest value in uniqueField.

    Thanks

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply