How Can I Select All But One Record?

  • Hi,

    The following SELECT gets a list of record where the ISMid and Date value(s) are the same. In fact, there should only be a single record for any ISMid and Date value. The additional entries are considered "bad data".

    Without using a cursor, how can I use an UPDATE statement to set the ISMid value to NULL for all duplicated values except one ? (setting the ISMid value to NULL will effectively safely remove the duplicates without actually deleting them)

    In the result set below, it just happens that there is only a maximum of two records for each ISMid/Date value. However, there could be more than two. So, I could try to Update and leave the Min(ENRid) or the Max(ENRid). However, I just cannot get my head around how to do this

    Any help would be very much appreciated.

     

    SELECT ENRid, ISMid, Date, Active FROM enrollment enr

     WHERE enr.ismid IN

     (SELECT x.ismid FROM

      (SELECT ismid, date FROM enrollment WHERE ismid IS NOT NULL AND active = 1

       GROUP BY ismid, date HAVING COUNT(*) > 1) x )

     AND enr.active = 1

    ENRid       ISMid       Date                                                   Active

    ----------- ----------- ------------------------------------------------------ ------

    14          14          2001-08-23 00:00:00.000                                1

    16          14          2001-08-23 00:00:00.000                                1

    24          23          1999-08-30 00:00:00.000                                1

    26          23          1999-08-30 00:00:00.000                                1

    70          68          1999-08-30 00:00:00.000                                1

    72          68          1999-08-30 00:00:00.000                                1

    80          77          1999-08-30 00:00:00.000                                1

    82          77          1999-08-30 00:00:00.000                                1

    97          92          2001-08-22 00:00:00.000                                1

    99          92          2001-08-22 00:00:00.000                                1

    210         201         2001-10-25 00:00:00.000                                1

    212         201         2001-10-25 00:00:00.000                                1

    214         203         2001-09-04 00:00:00.000                                1

    216         203         2001-09-04 00:00:00.000                                1

  • Assuming that the column EnRID is a unique row id, try this:

    UPDATE Enrollment SET ISMID = NULL

     WHERE Enrid IN (SELECT MIN(E.ENRID)

                         FROM ENROLLMENT E INNER JOIN

                                         (SELECT ismid, Mydate

                                            FROM enrollment

                                           WHERE ismid IS NOT NULL AND active = 1

                                           GROUP BY ismid, Mydate

                                          HAVING COUNT(*) > 1) x ON E.IsMid = X.IsMid AND E.MyDate = X.MyDate

                                  GROUP BY E.IsMid, E.MyDate

                                 )

       AND ACTIVE = 1

    Have a great day

    Wayne

  • Hi Wayne,

    Its pretty late over here so I haven't had a chance to test for total accuracy but after running your code it seems to work really well ! However, what surprised me is that the update "removed" all the duplicates except one pair. I had to run the code again to get rid of the final pair of records?

    Any ideas why this might have happened?

    Thanks for the great code

    Regards.

     

     

  • Without checking the orignal data, I cannot be certain.  However, my first thought would be that there was a case of three rows instead of just the two you expected.  By changing the logic a bit to read

     WHERE Enrid NOT IN (SELECT MIN(E.ENRID)

     we could have gotten all of them in a single run.

    Happy to help

    Have a great day!

    Wayne

  • The following code assumes that ENRid is a UniqueIdentifier:

     

    Update

       Enrollment

    Set

       ISMid = NULL

    From

       Enrollment enr1

    Where

       enr1.ENRid >

          (

           Select

              Min(ENRid)

           From

              Enrollment enr2

           Where

              enr2.ISMid = enr1.ISMid

          )

  • I think this would work:

    UPDATE Enrollment

    SET ISMID = NULL

    WHERE ENRid NOT IN (

      SELECT MIN(E.ENRid)

      FROM Enrollment

      WHERE NOT ISMid IS NULL

      GROUP BY ISMid, [Date]

    )

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply