Eliminate Duplicate Rows when joining multiple tables

  • We have the below query which is pulling in Sales and Revenue information. Since the sale is recorded in just one month and the revenue is recorded each month, we need to have the results of this query to only list the Sales amount once, but still have all the other revenue amounts listed for each month. In this example, the sale is record in year 2014 and month 10, but there are revenues in every month as well for the rest of 2014 and the start of 2015 but we only want to the sales amount to appear once on this results set. Ideas?

    SELECT

    project.project_number,

    project.country_code,

    project.project_desc,

    gsl.global_service_line_desc,

    buy.buyer_desc,

    project.project_mgr_first_name,

    project.project_mgr_last_name,

    tblWWClient.local_client_name,

    tblWWClient.global_client_code,

    project.local_crm_first_name,

    project.local_crm_last_name,

    project.client_buyer_name,

    project.client_buyer_title,

    projrevenue.fiscal_year as 'Fiscal Year',

    projrevenue.fiscal_period as 'Month',

    project.project_start_date,

    project.project_end_date,

    project.local_project_estimated_fees as USD_Sales,

    SUM(projrevenue.local_consulting_fees + projrevenue.local_product_fees + projrevenue.local_admin_fees + projrevenue.local_misc_fees) as [Revenue],

    project.project_status,

    fd.fiscal_period_end_date

    FROM dbo.tblWorldWide_Clients tblWWClient

    JOIN tblProject project on tblWWClient.local_client_code = project.local_client_code

    JOIN tblProject_Revenue projrevenue on project.project_number = projrevenue.project_number

    JOIN tblGlobal_Service_Line gsl ON project.global_service_line_code = gsl.global_service_line_code

    JOIN tblBuyer buy ON project.buyer_code = buy.buyer_code

    JOIN tblFiscal_Definitions fd ON project.fiscal_year = fd.fiscal_year AND project.fiscal_period = fd.fiscal_period

    WHERE tblWWClient.global_client_code = '90000010' AND ((project.fiscal_year = 2014 AND project.fiscal_period IN (9,10,11,12))

    OR (project.fiscal_year = 2015 AND project.fiscal_period IN (4,5,6,7,8)))

    AND

    ((projrevenue.fiscal_year = 2014 AND projrevenue.fiscal_period IN (9,10,11,12))

    OR (projrevenue.fiscal_year = 2015 AND projrevenue.fiscal_period IN (4,5,6,7,8))

    ) and project.project_number = '1401932001'

    GROUP BY project.project_number,

    project.country_code,

    project.project_desc,

    gsl.global_service_line_desc,

    buy.buyer_desc,

    project.project_mgr_first_name,

    project.project_mgr_last_name,

    tblWWClient.local_client_name,

    tblWWClient.global_client_code,

    project.local_crm_first_name,

    project.local_crm_last_name,

    project.client_buyer_name,

    project.client_buyer_title,

    projrevenue.fiscal_year,

    project.fiscal_year,

    project.fiscal_period,

    projrevenue.fiscal_period,

    project.project_end_date,

    project.project_status,

    project.local_project_estimated_fees,

    project.project_start_date,

    fd.fiscal_period_end_date

    ORDER BY project.project_number, project.fiscal_year, project.fiscal_period

  • If you want a record for each revenue for each month of each year, then I do not see how you can do that without repeating the sale in each row.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Personally, I think the query is correct. Masking of the subsequent sales amounts should occur in the front end application or report, not at the back end.

  • Reply back with a simplified example of what the query result would look like.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • the attached file has 2 tabs. The first tab is the current result set. The 2nd tab is what we are trying to get to. You will notice that the highlighted cells in yellow are now blank but the sales are being reported once in the first row when the sale really happened.

  • I have an idea. You could use a case statement to set the sales amount to 0.00 if the year and month for the revenue is not the same as the sales.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • that will work for some of the situations, but not always as there could be situations where there are no revenue reported in the month of the sale

  • I don't see the sale date field and I'm assuming that you want to show the sale information in the fiscal period where the sale occurred. I'm only including the relevant part of the query.

    SELECT

    .

    .

    .

    SUM(CASE WHEN DATEDIFF(MONTH, your_sale_date_field, fp.fiscal_period_end_date) = 0 THEN project.local_project_estimated_fees ELSE NULL END) as USD_Sales,

    .

    .

    .

    GROUP BY

    .

    .

    .

    -- project.local_project_estimated_fees,

    You need to review which fields you are grouping on. You have far too many fields in your group by clause. You should do your grouping as soon as you have all the necessary information. This means that you should do your grouping before doing joins on tables that are not necessary for your aggregate calculations.

    You also don't generally need to group on non-key fields like project_manager_first_name or project_manager_last_name. Take those fields out of the grouping clause and replace the corresponding fields in the SELECT clause with MIN(project_manager_first_name). This will greatly reduce the number of groups that SQL Server has to accommodate and therefore greatly improve the speed of this query.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Lynn Pettis (6/8/2015)


    Personally, I think the query is correct. Masking of the subsequent sales amounts should occur in the front end application or report, not at the back end.

    +1

    It will be far simpler doing this part in SSRS.

    Don Simpson



    I'm not sure about Heisenberg.

  • With nothing to use for testing here is a swag:

    WITH BaseData as (

    SELECT

    rn = ROW_NUMBER() OVER (PARTITION BY

    project.project_number,

    project.country_code,

    project.project_desc,

    gsl.global_service_line_desc,

    buy.buyer_desc,

    project.project_mgr_first_name,

    project.project_mgr_last_name,

    tblWWClient.local_client_name,

    tblWWClient.global_client_code,

    project.local_crm_first_name,

    project.local_crm_last_name,

    project.client_buyer_name,

    project.client_buyer_title,

    projrevenue.fiscal_year,

    project.fiscal_year,

    project.fiscal_period,

    projrevenue.fiscal_period,

    project.project_end_date,

    project.project_status,

    project.local_project_estimated_fees,

    project.project_start_date,

    fd.fiscal_period_end_date

    ORDER BY

    project.project_number,

    project.fiscal_year,

    project.fiscal_period),

    project.project_number,

    project.country_code,

    project.project_desc,

    gsl.global_service_line_desc,

    buy.buyer_desc,

    project.project_mgr_first_name,

    project.project_mgr_last_name,

    tblWWClient.local_client_name,

    tblWWClient.global_client_code,

    project.local_crm_first_name,

    project.local_crm_last_name,

    project.client_buyer_name,

    project.client_buyer_title,

    projrevenue.fiscal_year as 'Fiscal Year',

    projrevenue.fiscal_period as 'Month',

    project.project_start_date,

    project.project_end_date,

    project.local_project_estimated_fees as USD_Sales,

    SUM(projrevenue.local_consulting_fees + projrevenue.local_product_fees + projrevenue.local_admin_fees + projrevenue.local_misc_fees) as [Revenue],

    project.project_status,

    fd.fiscal_period_end_date

    FROM

    dbo.tblWorldWide_Clients tblWWClient

    INNER JOIN tblProject project

    on tblWWClient.local_client_code = project.local_client_code

    INNER JOIN tblProject_Revenue projrevenue

    on project.project_number = projrevenue.project_number

    INNER JOIN tblGlobal_Service_Line gsl

    ON project.global_service_line_code = gsl.global_service_line_code

    INNER JOIN tblBuyer buy

    ON project.buyer_code = buy.buyer_code

    INNER JOIN tblFiscal_Definitions fd

    ON project.fiscal_year = fd.fiscal_year AND

    project.fiscal_period = fd.fiscal_period

    WHERE

    tblWWClient.global_client_code = '90000010' AND

    ((project.fiscal_year = 2014 AND project.fiscal_period IN (9,10,11,12)) OR

    (project.fiscal_year = 2015 AND project.fiscal_period IN (4,5,6,7,8))) AND

    ((projrevenue.fiscal_year = 2014 AND projrevenue.fiscal_period IN (9,10,11,12)) OR

    (projrevenue.fiscal_year = 2015 AND projrevenue.fiscal_period IN (4,5,6,7,8))) AND

    project.project_number = '1401932001'

    GROUP BY

    project.project_number,

    project.country_code,

    project.project_desc,

    gsl.global_service_line_desc,

    buy.buyer_desc,

    project.project_mgr_first_name,

    project.project_mgr_last_name,

    tblWWClient.local_client_name,

    tblWWClient.global_client_code,

    project.local_crm_first_name,

    project.local_crm_last_name,

    project.client_buyer_name,

    project.client_buyer_title,

    projrevenue.fiscal_year,

    project.fiscal_year,

    project.fiscal_period,

    projrevenue.fiscal_period,

    project.project_end_date,

    project.project_status,

    project.local_project_estimated_fees,

    project.project_start_date,

    fd.fiscal_period_end_date

    )

    SELECT

    project_number,

    country_code,

    project_desc,

    global_service_line_desc,

    buyer_desc,

    project_mgr_first_name,

    project_mgr_last_name,

    local_client_name,

    global_client_code,

    local_crm_first_name,

    local_crm_last_name,

    client_buyer_name,

    client_buyer_title,

    [Fiscal Year],

    [Month],

    project_start_date,

    project_end_date,

    USD_Sales,

    CASE WHEN RN = 1 THEN CAST([Revenue] as varchar(20)) ELSE '' END as [Revenue],

    project_status,

    fiscal_period_end_date

    FROM

    BaseData

    ORDER BY

    project_number,

    fiscal_year,

    fiscal_period;

    Please note, I'd rather leave the formatting of the output to the front end instead of doing here in the query.

  • Lynn Pettis (6/8/2015)


    ...

    Please note, I'd rather leave the formatting of the output to the front end instead of doing here in the query.

    + 1



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • DonlSimpson (6/8/2015)


    Lynn Pettis (6/8/2015)


    Personally, I think the query is correct. Masking of the subsequent sales amounts should occur in the front end application or report, not at the back end.

    +1

    It will be far simpler doing this part in SSRS.

    Ideally, we could just convince the user to ignore the repeated sales amounts.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I greatly simplified your query by doing the aggregates as soon as I had all the necessary information. It turns out that the tblProjectRevenue contained enough information to do the aggregates before joining to any tables. I also included a way to account for sales date that don't have a corresponding revenue period. If that solution doesn't work, then using the ROW_NUMBER as others have suggested is the best alternative. There were a couple of places where I didn't have enough information to give you a complete query. These all have to do with the date of the sale.

    ;

    WITH Project_Revenue AS (

    SELECT pr.project_number, pr.fiscal_year, pr.fiscal_period, pr.local_consulting_fees + pr.local_product_fees + pr.local_admin_fees + pr.local_misc_fees as [Revenue],

    FROM tblProject_Revenue AS pr

    UNION

    SELECT proj.project_number, <SALE DATE FISCAL YEAR CALCULATION>, <SALE DATE FISCAL PERIOD CALCULATION>, 0

    FROM tblProject AS proj

    )

    , Project_Revenue_Totals AS (

    SELECT pr.project_number, pr.fiscal_year, pr.fiscal_period, SUM(Revenue) AS Revenue

    FROM Project_Revenue AS pr

    GROUP BY pr.project_number, pr.fiscal_year, pr.fiscal_period

    )

    SELECT

    project.project_number,

    project.country_code,

    project.project_desc,

    gsl.global_service_line_desc,

    buy.buyer_desc,

    project.project_mgr_first_name,

    project.project_mgr_last_name,

    tblWWClient.local_client_name,

    tblWWClient.global_client_code,

    project.local_crm_first_name,

    project.local_crm_last_name,

    project.client_buyer_name,

    project.client_buyer_title,

    projrevenue.fiscal_year as 'Fiscal Year',

    projrevenue.fiscal_period as 'Month',

    project.project_start_date,

    project.project_end_date,

    CASE WHEN <sale date matches project revenue fiscal period> THEN project.local_project_estimated_fees ELSE NULL END as USD_Sales,

    projrevenue.Revenue,

    project.project_status,

    fd.fiscal_period_end_date

    FROM dbo.tblWorldWide_Clients tblWWClient

    JOIN tblProject project on tblWWClient.local_client_code = project.local_client_code

    JOIN Project_Revenue_Totals projrevenue on project.project_number = projrevenue.project_number

    JOIN tblGlobal_Service_Line gsl ON project.global_service_line_code = gsl.global_service_line_code

    JOIN tblBuyer buy ON project.buyer_code = buy.buyer_code

    JOIN tblFiscal_Definitions fd ON project.fiscal_year = fd.fiscal_year AND project.fiscal_period = fd.fiscal_period

    WHERE tblWWClient.global_client_code = '90000010' AND ((project.fiscal_year = 2014 AND project.fiscal_period IN (9,10,11,12))

    OR (project.fiscal_year = 2015 AND project.fiscal_period IN (4,5,6,7,8)))

    AND

    ((projrevenue.fiscal_year = 2014 AND projrevenue.fiscal_period IN (9,10,11,12))

    OR (projrevenue.fiscal_year = 2015 AND projrevenue.fiscal_period IN (4,5,6,7,8))

    ) and project.project_number = '1401932001'

    ORDER BY project.project_number, project.fiscal_year, project.fiscal_period

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • thanks for the syntax, but when I run this in our environment, I still have the sales amount repeated for all the months where there is revenue.

  • tstagliano (6/9/2015)


    thanks for the syntax, but when I run this in our environment, I still have the sales amount repeated for all the months where there is revenue.

    You've probably set up your CASE statement for the sales incorrectly. You haven't provided enough information to determine what the correct syntax should be.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply