Home Forums SQL Server 7,2000 T-SQL eliminating duplicates from stored procedure joins RE: eliminating duplicates from stored procedure joins

  • You may be pleased (kind of!) to know that the reason you have had so much trouble is that what you are trying to do is logically impossible.

    You don't have a fan trap - but you do have a standard chain of three tables.

    You need to be aware of which table is at the 'top' of the chain.

    This is the Case table.

    The chain goes from one to many:

    .CASE

    .

    .associated with many

    .

    ...REPORTS

    ...

    ...viewed in many

    ...

    ......USERVIEWINGS.

    Here is an example assuming only one 'case' table, and using inner joins for simplicity:

    The tables:

    CASE:

    id

    --

    .1

    .2

    REPORT:

    id....CaseID

    --....------

    .1.........1

    .2.........1

    .3.........2

    .4.........2

    USERVIEWING:

    id....RepID

    --....-----

    .1........1

    .2........1

    .3........2

    .4........2

    .5........3

    .6........3

    .7........4

    .8........4

    The SELECT statement:

    select

    C.id as Cid,

    R.id as Rid,

    U.id as Uid

    from CASE C

    join REPORT R

    on R.CaseID = C.id

    join USERVIEWING U

    on U.RepID = R.id

    The results:

    Cid Rid Uid

    ..1...1...1

    ..1...1...2

    ..1...2...3

    ..1...2...4

    ..2...3...5

    ..2...3...6

    ..2...4...7

    ..2...4...8

    As you can see, we can't return fewer records than the number of matching records in the bottom-most table (USERVIEWING.)

    In your case, you are using outer joins, so you are likely to return even more records (i.e. even unmatched records from the higher tables.)

    This is unavoidable.

    If you have a one-to-many relationship between two tables, but you only want one record per record from the topmost (one) table, you have to decide what do do with the extra information from the 'child' (many) tables.

    Often you will use an aggregate function like 'SUM', 'COUNT' or MAX to combine many values into one.

    Alternatively you might do something more comlpicated, and either create a crosstab (so that by using extra columns you can show the multiple values on one line), or you might combine all the values into a single field, separated by commas, say.

    The last two are more complex, and I wouldn't recommend either if you can avoid it.

    It might be useful to hear from you what the query is being used for, as I think we need to understand your requuirements in order to decide what you need to do.

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant