Eliminate Duplicate Rows when joining multiple tables

  • tstagliano (6/8/2015)


    We have the below query which is pulling in Sales and Revenue information. Since the sale is recorded in just one month and the revenue is recorded each month, we need to have the results of this query to only list the Sales amount once, but still have all the other revenue amounts listed for each month. In this example, the sale is record in year 2014 and month 10, but there are revenues in every month as well for the rest of 2014 and the start of 2015 but we only want to the sales amount to appear once on this results set. Ideas?

    ...

    Are you missing a couple of join predicates?

    JOIN tblProject project

    on tblWWClient.local_client_code = project.local_client_code

    JOIN tblProject_Revenue projrevenue

    on project.project_number = projrevenue.project_number

    AND projrevenue.fiscal_year = project.fiscal_year -- missing

    AND projrevenue.fiscal_period = project.fiscal_period -- missing

    JOIN tblGlobal_Service_Line gsl

    ON project.global_service_line_code = gsl.global_service_line_code

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I'm assuming the goal here is to have USD_Sales populated only on the lowest ranked row within a group.

    I've found the simplest, most flexible, and often the most performant approach to this is to start with the most simple version of query, select that into a a temp table, and then follow that with something like a self joining update to set USD_Sales = NULL for those rows that arn't ranked lowest, highest, or whatever.

    select ...

    into #T

    from ...;

    update T

    set T.USD_Sales = NULL

    from #T as T

    join

    (

    select project_number, fiscal_year, month

    , dense_rank() over (parition by project_number

    order by fiscal_year asc, month asc)rank_order

    from #T

    ) X on X.project_number = T.project_number

    and X.fiscal_year = T.fiscal_year

    and X.month = T.month

    and X.rank_order = 1;

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • the temp table was the route I took to get the results. Thanks everyone for you help.

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply