query joining multiple tables getting duplicates- how to stop

  • I'm joining several tables and when I add the last one I get duplicate results. How can I get just one for each? Thanks for any ideas.

    select a.field, b.field, c.field

    from atblname as a inner join btblname as b on a.id = b.parent_id

    left outer join ctblname as c on a.id = c.parent_id

    There are more than one result when joining tbl a and c, but I'm getting a reult for each of them for all results from joining a and b.

  • Hi!

    From a quick look (and unless I have missed something) I think you need only to change SELECT to SELECT DISTINCT in your query to solve the problem.

    James

  • Denise McMillan (5/17/2013)


    I'm joining several tables and when I add the last one I get duplicate results. How can I get just one for each? Thanks for any ideas.

    select a.field, b.field, c.field

    from atblname as a inner join btblname as b on a.id = b.parent_id

    left outer join ctblname as c on a.id = c.parent_id

    There are more than one result when joining tbl a and c, but I'm getting a reult for each of them for all results from joining a and b.

    What is PK of ctblname? Basically your current join to table ctblname finds more than one record there with the same a.id = c.parent_id, so you got your results "duplicated".

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • I tried using distinct but one of the fields I need is text and so can't use it.

    Each table has an id PK only.

    I may try making use of a temp table.

    Thanks very much for the ideas.

  • I'm not sure what you mean by duplicate. If there is a one to many relationship between table a and table b (your inner join), then yes, you will get multiple records from table c where there is a match between a.id and c.id.

  • Hi!

    First option, you could change the column type to varchar(max) [text is, from my reading on here, to be depreciated].

    As that might be a bit drastic, you can 'type cast' the text column to varchar(max) - this will allow you to do a 'DISTINCT' and by using 'max' should ensure your text is not truncated

    Below is a simple example; 'strText' is a text column, the output column from the query will be 'strTextAsVC'

    select distinct

    CONVERT ( varchar(max), strText ) as strTextAsVC

    from

    tblTest

    James

  • I tried using the convert and distinct. It runs but still gives me all the duplicate records. I also tried putting the results of the join on the first 2 tables into a temp table, them joining the 3rd table but it still gave me the duplicates. I don't seeany other field to join on in the 3rd table.

    Maybe I need a subquery somehow.

    Thanks again for the ideas.

  • Denise McMillan (5/17/2013)


    I tried using the convert and distinct. It runs but still gives me all the duplicate records. I also tried putting the results of the join on the first 2 tables into a temp table, them joining the 3rd table but it still gave me the duplicates. I don't seeany other field to join on in the 3rd table.

    Maybe I need a subquery somehow.

    Thanks again for the ideas.

    Look at the following:

    create table tablea (

    id int,

    col1 varchar(10),

    col2 datetime

    );

    create table tableb (

    id int,

    col1 varchar(10),

    col2 datetime

    );

    create table tablec (

    id int,

    col1 varchar(10),

    col2 datetime

    );

    insert into tablea

    values (1,'Row1','2013-05-16 10:00:00'),(2,'Row2','2013-05-16 12:00:00'),(3,'Row3','2013-05-16 14:00:00');

    insert into tableb

    values (1,'Row1','2013-05-16 10:15:00'),(1,'Row2','2013-05-16 10:30:00'),(2,'Row3','2013-05-16 12:15:00'),(2,'Row4','2013-05-16 12:00:00'),(3,'Row5','2013-05-16 14:00:00');

    insert into tablec

    values (1,'Row1','2013-05-16 11:00:00'),(3,'Row2','2013-05-16 15:00:00');

    select

    a.id as AId,

    a.col1 as ACol1,

    a.col2 as ACol2,

    b.id as BId,

    b.col1 as BCol1,

    b.col2 as BCol2,

    c.id as CId,

    c.col1 as CCol1,

    c.col2 as CCol2

    from

    tablea a

    inner join tableb b

    on (a.id = b.id)

    left outer join tablec c

    on (a.id = c.id);

    drop table tablea;

    drop table tableb;

    drop table tablec;

    When you run the above you get the following results:

    AId ACol1 ACol2 BId BCol1 BCol2 CId CCol1 CCol2

    ----------- ---------- ----------------------- ----------- ---------- ----------------------- ----------- ---------- -----------------------

    1 Row1 2013-05-16 10:00:00.000 1 Row1 2013-05-16 10:15:00.000 1 Row1 2013-05-16 11:00:00.000

    1 Row1 2013-05-16 10:00:00.000 1 Row2 2013-05-16 10:30:00.000 1 Row1 2013-05-16 11:00:00.000

    2 Row2 2013-05-16 12:00:00.000 2 Row3 2013-05-16 12:15:00.000 NULL NULL NULL

    2 Row2 2013-05-16 12:00:00.000 2 Row4 2013-05-16 12:00:00.000 NULL NULL NULL

    3 Row3 2013-05-16 14:00:00.000 3 Row5 2013-05-16 14:00:00.000 3 Row2 2013-05-16 15:00:00.000

    The data from tablec is "duplicated" for a.id = c.id because there are 2 unique rows of data represented by a.id = b.id.

    Is this what you are talking about? What is it you actually want to see?

  • I think I can't get what I need because what is in the 3rd table is not related to any one item in the joined a and b tables. It is just once for all.

    Thanks very much to everyone who tried to help.

  • Denise McMillan (5/17/2013)


    I think I can't get what I need because what is in the 3rd table is not related to any one item in the joined a and b tables. It is just once for all.

    Thanks very much to everyone who tried to help.

    Me thinks you give up too easily.

    Is this sort of what you are looking for?

    create table tablea (

    id int,

    col1 varchar(10),

    col2 datetime

    );

    create table tableb (

    id int,

    col1 varchar(10),

    col2 datetime

    );

    create table tablec (

    id int,

    col1 varchar(10),

    col2 datetime

    );

    insert into tablea

    values (1,'Row1','2013-05-16 10:00:00'),(2,'Row2','2013-05-16 12:00:00'),(3,'Row3','2013-05-16 14:00:00');

    insert into tableb

    values (1,'Row1','2013-05-16 10:15:00'),(1,'Row2','2013-05-16 10:30:00'),(2,'Row3','2013-05-16 12:15:00'),(2,'Row4','2013-05-16 12:00:00'),(3,'Row5','2013-05-16 14:00:00');

    insert into tablec

    values (1,'Row1','2013-05-16 11:00:00'),(3,'Row2','2013-05-16 15:00:00');

    with TwoTables as (

    select

    a.id as AId,

    a.col1 as ACol1,

    a.col2 as ACol2,

    b.id as BId,

    b.col1 as BCol1,

    b.col2 as BCol2,

    rn = row_number() over (partition by a.id order by (select null))

    from

    tablea a

    inner join tableb b

    on (a.id = b.id)

    )

    select

    AId,

    ACol1,

    ACol2,

    BId,

    BCol1,

    BCol2,

    c.id as CId,

    c.col1 as CCol1,

    c.col2 as CCol2

    from

    TwoTables tt

    left outer join tablec c

    on (tt.AId = c.id

    and tt.rn = 1);

    drop table tablea;

    drop table tableb;

    drop table tablec;

    Results:

    AId ACol1 ACol2 BId BCol1 BCol2 CId CCol1 CCol2

    ----------- ---------- ----------------------- ----------- ---------- ----------------------- ----------- ---------- -----------------------

    1 Row1 2013-05-16 10:00:00.000 1 Row1 2013-05-16 10:15:00.000 1 Row1 2013-05-16 11:00:00.000

    1 Row1 2013-05-16 10:00:00.000 1 Row2 2013-05-16 10:30:00.000 NULL NULL NULL

    2 Row2 2013-05-16 12:00:00.000 2 Row3 2013-05-16 12:15:00.000 NULL NULL NULL

    2 Row2 2013-05-16 12:00:00.000 2 Row4 2013-05-16 12:00:00.000 NULL NULL NULL

    3 Row3 2013-05-16 14:00:00.000 3 Row5 2013-05-16 14:00:00.000 3 Row2 2013-05-16 15:00:00.000

  • Hi,

    I've tried your axample and it works great except for that i get the following results;

    Require;

    TableA

    TableB [one to many from TableA with an ID from Table A]

    TableC [One to many from TableA with an ID from Table A] - Table B & Table C has nothing to do with Each Other

    SQL query must show a record from Table A with ALL matching records from Table B as well as ALL matching records from table C.

    Currently [before trying your example] i get TableA record BUT Duplicate records from TableB and Duplicate records from TableC [Somehow table B & TableC affects the result.

    WITH YOUR EXAMPLE CODE i get TableC 100% correct [No Duplicated values]

    TableB however, duplicates on either 1 of the valus returned.

    from

    TwoTables tt

    left outer join QuickSpecMultiple c

    on (tt.AId = c.id

    and tt.rn = 1);

    Changing the tt.rn = 1 to 2, moves the duplication to record 2, to 3 moves it to record 3 an so forth. The other results is NOT duplicated and displays correctly.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply