query joining multiple tables getting duplicates- how to stop

Question

Post reply

query joining multiple tables getting duplicates- how to stop

Denise McMillan

SSCertifiable

Points: 6216
More actions
May 17, 2013 at 9:58 am

#272820

I'm joining several tables and when I add the last one I get duplicate results. How can I get just one for each? Thanks for any ideas.

select a.field, b.field, c.field

from atblname as a inner join btblname as b on a.id = b.parent_id

left outer join ctblname as c on a.id = c.parent_id

There are more than one result when joining tbl a and c, but I'm getting a reult for each of them for all results from joining a and b.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

james.bonnyman SSC Veteran Points: 255 More actions · Answer 1

james.bonnyman

SSC Veteran

Points: 255

May 17, 2013 at 10:24 am

#1616253

Hi!

From a quick look (and unless I have missed something) I think you need only to change SELECT to SELECT DISTINCT in your query to solve the problem.

James

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 2

Denise McMillan (5/17/2013)

I'm joining several tables and when I add the last one I get duplicate results. How can I get just one for each? Thanks for any ideas.

select a.field, b.field, c.field

from atblname as a inner join btblname as b on a.id = b.parent_id

left outer join ctblname as c on a.id = c.parent_id

There are more than one result when joining tbl a and c, but I'm getting a reult for each of them for all results from joining a and b.

What is PK of ctblname? Basically your current join to table ctblname finds more than one record there with the same a.id = c.parent_id, so you got your results "duplicated".

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Denise McMillan SSCertifiable Points: 6216 More actions · Answer 3

Denise McMillan

SSCertifiable

Points: 6216

May 17, 2013 at 11:06 am

#1616268

I tried using distinct but one of the fields I need is text and so can't use it.

Each table has an id PK only.

I may try making use of a temp table.

Thanks very much for the ideas.

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 4

Lynn Pettis

SSC Guru

Points: 442447

May 17, 2013 at 11:15 am

#1616275

I'm not sure what you mean by duplicate. If there is a one to many relationship between table a and table b (your inner join), then yes, you will get multiple records from table c where there is a match between a.id and c.id.

james.bonnyman SSC Veteran Points: 255 More actions · Answer 5

Hi!

First option, you could change the column type to varchar(max) [text is, from my reading on here, to be depreciated].

As that might be a bit drastic, you can 'type cast' the text column to varchar(max) - this will allow you to do a 'DISTINCT' and by using 'max' should ensure your text is not truncated

Below is a simple example; 'strText' is a text column, the output column from the query will be 'strTextAsVC'

select distinct

CONVERT ( varchar(max), strText ) as strTextAsVC

from

tblTest

James

Denise McMillan SSCertifiable Points: 6216 More actions · Answer 6

I tried using the convert and distinct. It runs but still gives me all the duplicate records. I also tried putting the results of the join on the first 2 tables into a temp table, them joining the 3rd table but it still gave me the duplicates. I don't seeany other field to join on in the 3rd table.

Maybe I need a subquery somehow.

Thanks again for the ideas.

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 7

Denise McMillan (5/17/2013)

I tried using the convert and distinct. It runs but still gives me all the duplicate records. I also tried putting the results of the join on the first 2 tables into a temp table, them joining the 3rd table but it still gave me the duplicates. I don't seeany other field to join on in the 3rd table.

Maybe I need a subquery somehow.

Thanks again for the ideas.

Look at the following:



create table tablea (

    id int,

    col1 varchar(10),

    col2 datetime

);

create table tableb (

    id int,

    col1 varchar(10),

    col2 datetime

);

create table tablec (

    id int,

    col1 varchar(10),

    col2 datetime

);

insert into tablea

values (1,'Row1','2013-05-16 10:00:00'),(2,'Row2','2013-05-16 12:00:00'),(3,'Row3','2013-05-16 14:00:00');

insert into tableb

values (1,'Row1','2013-05-16 10:15:00'),(1,'Row2','2013-05-16 10:30:00'),(2,'Row3','2013-05-16 12:15:00'),(2,'Row4','2013-05-16 12:00:00'),(3,'Row5','2013-05-16 14:00:00');

insert into tablec

values (1,'Row1','2013-05-16 11:00:00'),(3,'Row2','2013-05-16 15:00:00');

select

    a.id as AId,

    a.col1 as ACol1,

    a.col2 as ACol2,

    b.id as BId,

    b.col1 as BCol1,

    b.col2 as BCol2,

    c.id as CId,

    c.col1 as CCol1,

    c.col2 as CCol2

from

    tablea a

    inner join tableb b

        on (a.id = b.id)

    left outer join tablec c

        on (a.id = c.id);

drop table tablea;

drop table tableb;

drop table tablec;

When you run the above you get the following results:



AId         ACol1      ACol2                   BId         BCol1      BCol2                   CId         CCol1      CCol2

----------- ---------- ----------------------- ----------- ---------- ----------------------- ----------- ---------- -----------------------

1           Row1       2013-05-16 10:00:00.000 1           Row1       2013-05-16 10:15:00.000 1           Row1       2013-05-16 11:00:00.000

1           Row1       2013-05-16 10:00:00.000 1           Row2       2013-05-16 10:30:00.000 1           Row1       2013-05-16 11:00:00.000

2           Row2       2013-05-16 12:00:00.000 2           Row3       2013-05-16 12:15:00.000 NULL        NULL       NULL

2           Row2       2013-05-16 12:00:00.000 2           Row4       2013-05-16 12:00:00.000 NULL        NULL       NULL

3           Row3       2013-05-16 14:00:00.000 3           Row5       2013-05-16 14:00:00.000 3           Row2       2013-05-16 15:00:00.000

The data from tablec is "duplicated" for a.id = c.id because there are 2 unique rows of data represented by a.id = b.id.

Is this what you are talking about? What is it you actually want to see?

Denise McMillan SSCertifiable Points: 6216 More actions · Answer 8

Denise McMillan

SSCertifiable

Points: 6216

May 17, 2013 at 12:12 pm

#1616313

I think I can't get what I need because what is in the 3rd table is not related to any one item in the joined a and b tables. It is just once for all.

Thanks very much to everyone who tried to help.

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 9

Denise McMillan (5/17/2013)

I think I can't get what I need because what is in the 3rd table is not related to any one item in the joined a and b tables. It is just once for all.

Thanks very much to everyone who tried to help.

Me thinks you give up too easily.

Is this sort of what you are looking for?



create table tablea (

    id int,

    col1 varchar(10),

    col2 datetime

);

create table tableb (

    id int,

    col1 varchar(10),

    col2 datetime

);

create table tablec (

    id int,

    col1 varchar(10),

    col2 datetime

);

insert into tablea

values (1,'Row1','2013-05-16 10:00:00'),(2,'Row2','2013-05-16 12:00:00'),(3,'Row3','2013-05-16 14:00:00');

insert into tableb

values (1,'Row1','2013-05-16 10:15:00'),(1,'Row2','2013-05-16 10:30:00'),(2,'Row3','2013-05-16 12:15:00'),(2,'Row4','2013-05-16 12:00:00'),(3,'Row5','2013-05-16 14:00:00');

insert into tablec

values (1,'Row1','2013-05-16 11:00:00'),(3,'Row2','2013-05-16 15:00:00');

with TwoTables as (

select

    a.id as AId,

    a.col1 as ACol1,

    a.col2 as ACol2,

    b.id as BId,

    b.col1 as BCol1,

    b.col2 as BCol2,

    rn = row_number() over (partition by a.id order by (select null))

from

    tablea a

    inner join tableb b

        on (a.id = b.id)

)

select

    AId,

    ACol1,

    ACol2,

    BId,

    BCol1,

    BCol2,

    c.id as CId,

    c.col1 as CCol1,

    c.col2 as CCol2

from

    TwoTables tt

    left outer join tablec c

        on (tt.AId = c.id

            and tt.rn = 1);

drop table tablea;

drop table tableb;

drop table tablec;

Results:



AId         ACol1      ACol2                   BId         BCol1      BCol2                   CId         CCol1      CCol2

----------- ---------- ----------------------- ----------- ---------- ----------------------- ----------- ---------- -----------------------

1           Row1       2013-05-16 10:00:00.000 1           Row1       2013-05-16 10:15:00.000 1           Row1       2013-05-16 11:00:00.000

1           Row1       2013-05-16 10:00:00.000 1           Row2       2013-05-16 10:30:00.000 NULL        NULL       NULL

2           Row2       2013-05-16 12:00:00.000 2           Row3       2013-05-16 12:15:00.000 NULL        NULL       NULL

2           Row2       2013-05-16 12:00:00.000 2           Row4       2013-05-16 12:00:00.000 NULL        NULL       NULL

3           Row3       2013-05-16 14:00:00.000 3           Row5       2013-05-16 14:00:00.000 3           Row2       2013-05-16 15:00:00.000

corne660325 SSC Rookie Points: 32 More actions · Answer 10

Hi,

I've tried your axample and it works great except for that i get the following results;

Require;

TableA

TableB [one to many from TableA with an ID from Table A]

TableC [One to many from TableA with an ID from Table A] - Table B & Table C has nothing to do with Each Other

SQL query must show a record from Table A with ALL matching records from Table B as well as ALL matching records from table C.

Currently [before trying your example] i get TableA record BUT Duplicate records from TableB and Duplicate records from TableC [Somehow table B & TableC affects the result.

WITH YOUR EXAMPLE CODE i get TableC 100% correct [No Duplicated values]

TableB however, duplicates on either 1 of the valus returned.

from

TwoTables tt

left outer join QuickSpecMultiple c

on (tt.AId = c.id

and tt.rn = 1);

Changing the tt.rn = 1 to 2, moves the duplication to record 2, to 3 moves it to record 3 an so forth. The other results is NOT duplicated and displays correctly.