Select Distinct for 4 columns but return all columns.. how?

  • dear friends,

    I have a table with too manly fields. there are duplicate records in in rows BUT ONLY for some fields.

    I need to use DISTINCT on 4 columns and then return all rows.

    or at least I should get ID value of records in result so I can real all columns with code.

    lets say:

    col1 col2 col3 col4 col5 col6

    a1 AA BB a4 CC a6

    b1 b2 b3 b4 CC b6

    c1 AA BB c4 CC c6

    result must be:

    a1 AA BB a4 CC a6

    b1 b2 b3 b4 CC b6

    here's test code:

    CREATE TABLE [dbo].[Table_2](

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [col1] [nvarchar](50) NULL,

    [col2] [nvarchar](50) NULL,

    [col3] [nvarchar](50) NULL,

    [col4] [nvarchar](50) NULL,

    [col5] [nvarchar](50) NULL,

    [col6] [nvarchar](50) NULL

    ) ON [PRIMARY]

    insert into table_1 (col1,col2,col3,col4,col5,col6) values ('a1','AA','BB','a4','CC','a6')

    insert into table_1 (col1,col2,col3,col4,col5,col6) values ('b1','b2','b3','b4','b5','b6')

    insert into table_1 (col1,col2,col3,col4,col5,col6) values ('c1','AA','BB','c4','CC','c6')

  • Your test preparation code yields errors (you create one table then insert to another).

    If I understood you correctly, you need to identify duplicates by subset of columns,

    and then display all that duplicate rows.

    Grouping functions like COUNT can also be used with OVER() clause and have completely different effect.

    Run this:

    select *,

    Duplicates = COUNT(*) OVER(partition by col2, col3, col5)

    from Table_2

    Result:

    ID col1 col2 col3 col4 col5 col6 Duplicates

    1 a1 AA BB a4 CC a6 2

    3 c1 AA BB c4 CC c6 2

    2 b1 b2 b3 b4 b5 b6 1

    So, the solution is:

    SELECT ID, col1,col2,col3,col4,col5,col6

    FROM

    ( select *,

    Duplicates = COUNT(*) OVER(partition by col2, col3, col5)

    from Table_2

    ) t

    WHERE t.Duplicates > 1

    Result:

    ID col1 col2 col3 col4 col5 col6

    1 a1 AA BB a4 CC a6

    3 c1 AA BB c4 CC c6

    Inline view is used because OVER() by definition executes just before ORDER BY, so you can't use it in e.g. WHERE because it doesn't exist yet.

    HTH

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • If you need just first row of each group:

    SELECT ID, col1,col2,col3,col4,col5,col6

    FROM

    ( select *,

    Rnr = ROW_NUMBER() OVER(partition by col2, col3, col5 ORDER BY ID)

    from Table_2

    ) t

    WHERE t.Rnr = 1

    Result:

    ID col1 col2 col3 col4 col5 col6

    1 a1 AA BB a4 CC a6

    2 b1 b2 b3 b4 b5 b6

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • my friend, thats exactly what I need.

    you helped me so much. thanks..

  • I'm glad to help, you are welcome.

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply