Avoid grouping twice on a query

  • I got a problem doing a multiple join to retrieve some data...

    Tables look like:

    Employees

    EmpID Name

    1 ANDREW

    2 STEPHEN

    3 DAVID

    Companies

    ComID Name

    1 Company1

    2 Company2

    Payments

    PayID Date

    1 2016-02-25

    UserPayments

    RecID PayID ComID EmpID WeekNo Amt1 Amt2

    1 1 1 1 1 20.5 20.5

    2 1 1 1 2 20.5 20.5

    3 1 1 1 3 20.5 20.5

    4 1 1 1 4 20.5 20.5

    5 1 2 2 1 20.5 20.5

    6 1 2 2 2 20.5 20.5

    7 1 2 2 3 20.5 20.5

    8 1 2 2 4 20.5 20.5

    9 1 2 3 1 20.5 20.5

    10 1 2 3 2 20.5 20.5

    11 1 2 3 3 20.5 20.5

    12 1 2 3 4 20.5 20.5

    Result

    PayID EmpName UserName Amt1Total Amt2Total

    1 Company1 ANDREW 82 82

    1 Company2 STEPHEN 82 82

    1 Company2 DAVID 82 82

    So my problem is that my query need to filter twice and group within a subquery and then use DISTINCT not to show every record x4...

    I thought I would be an easy task but so far I've tried several hours and still not seeing an alternative query for doing this...

    Here's what I got now.

    SELECT DISTINCT

    p.PayID,

    e.Name,

    c.Name,

    xx.a,

    xx.b

    FROM

    UserPayments p

    LEFT JOIN

    Employees e

    ON p.EmpID = e.EmpID

    LEFT JOIN

    Companies c

    ON p.ComID = c.ComID

    LEFT JOIN

    (SELECT

    p.EmpID,

    SUM(p.Amt1) a,

    SUM(p.Amt2) b

    FROM

    UserPayments p

    WHERE p.PayID = 1

    GROUP BY p.EmpID

    ) xx

    ON p.EmpID = xx.EmpID

    WHERE p.PayID = 1

    Whis query works but its just the nature of using DISTINCT and having the aggregate SUMs on a subquery join... I don't know... Is it the most effective way of doing this?

    Regards


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • After adding your sample data into some temporary tables, this returned the same results as above:

    SELECT p.PayID,

    e.Name,

    c.Name,

    SUM(p.Amt1) SumAmt1,

    SUM(p.Amt2) SumAmt2

    FROM #UserPayments p

    INNER JOIN #Employees e ON p.EmpID = e.EmpID

    INNER JOIN #Companies c ON p.ComID = c.CID

    WHERE p.PayID = 1

    GROUP BY p.PayID,

    e.Name,

    c.Name

    It doesn't look like you need a subquery.

    Note: you may need to change the 'joins' as I've used INNER above - is LEFT necessary?

  • tindog (2/25/2016)


    After adding your sample data into some temporary tables, this returned the same results as above:

    SELECT p.PayID,

    e.Name,

    c.Name,

    SUM(p.Amt1) SumAmt1,

    SUM(p.Amt2) SumAmt2

    FROM #UserPayments p

    INNER JOIN #Employees e ON p.EmpID = e.EmpID

    INNER JOIN #Companies c ON p.ComID = c.CID

    WHERE p.PayID = 1

    GROUP BY p.PayID,

    e.Name,

    c.Name

    It doesn't look like you need a subquery.

    Note: you may need to change the 'joins' as I've used INNER above - is LEFT necessary?

    Hey there, thanks for your reply...

    I did that originally, but having so many fields on the GROUP BY clause makes it look even dodgier....

    Look (with the original table and field names (same thing though):

    SELECT

    tcs.CRSEntryNo,

    tp.PersonNo,

    tp.PenNo,

    tp.LName,

    tp.FName,

    te.EmployerName,

    tcs.DocType,

    SUM(tcs.EmployeeValue) AS Employee,

    SUM(tcs.EmployerValue) AS Employer

    FROM

    TblCrsSub tcs

    INNER JOIN

    TblPersonal tp ON tcs.PenNo = tp.PenNo

    INNER JOIN

    TblEmployer te ON tcs.Ref = te.EmployerCode

    WHERE

    tcs.CRSEntryNo = 4045

    GROUP BY

    tcs.CRSEntryNo,

    tp.PersonNo,

    tp.PenNo,

    tp.LName,

    tp.FName,

    te.EmployerName,

    tcs.DocType

    But whats even worst, if I do something like:

    SELECT

    CRSENtryNo,

    SUM(EmployeeValue) AS Tee,

    SUM(EmployerValue) AS Ter

    FROM TblCrsSub

    WHERE CRSEntryNo = 4045

    GROUP BY CRSEntryNo

    I get different values for both aggregated functions (?) I'm currently going through the records to see why the difference is happening


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • What do you mean by 'looks dodgier'? Does it not return the correct data?

    What's the issue with the first query in your recent post? Also, the second query can't be right, it's selecting everything and grouping by nothing.

  • tindog (2/25/2016)


    What do you mean by 'looks dodgier'? Does it not return the correct data?

    What's the issue with the first query in your recent post? Also, the second query can't be right, it's selecting everything and grouping by nothing.

    Its nothing, just the way it looks I think, GROUP BY everything, you know... Maybe i'm just biased since thats the way queries look like when you use SSMS query builder...

    And yes, they do produce different total amounts, I fixed the second query now...


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • Dodgier according to who?

    If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.

    WITH cteCrsSub AS(

    SELECT tcs.CRSEntryNo,

    tcs.PenNo,

    tcs.Ref,

    SUM(tcs.EmployeeValue) AS Employee,

    SUM(tcs.EmployerValue) AS Employer

    FROM TblCrsSub tcs

    WHERE

    tcs.CRSEntryNo = 4045

    GROUP BY tcs.CRSEntryNo,

    tcs.PenNo,

    tcs.Ref

    )

    SELECT

    tcs.CRSEntryNo,

    tp.PersonNo,

    tp.PenNo,

    tp.LName,

    tp.FName,

    te.EmployerName,

    tcs.DocType,

    tcs.Employee,

    tcs.Employer

    FROM cteCrsSub tcs

    JOIN TblPersonal tp ON tcs.PenNo = tp.PenNo

    JOIN TblEmployer te ON tcs.Ref = te.EmployerCode;

    I used a CTE because I find derived tables dodgy. 😀

    Your initial problem was that you were including the table twice, once aggregated and once as it is.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • sys.user (2/25/2016)


    tindog (2/25/2016)


    What do you mean by 'looks dodgier'? Does it not return the correct data?

    What's the issue with the first query in your recent post? Also, the second query can't be right, it's selecting everything and grouping by nothing.

    Its nothing, just the way it looks I think, GROUP BY everything, you know... Maybe i'm just biased since thats the way queries look like when you use SSMS query builder...

    And yes, they do produce different total amounts, I fixed the second query now...

    Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.

    Can you provide sample data for those tables - they look to be different to the tables in your original post.

  • Luis Cazares (2/25/2016)


    Dodgier according to who?

    If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.

    I used a CTE because I find derived tables dodgy. 😀

    Your initial problem was that you were including the table twice, once aggregated and once as it is.

    tindog (2/25/2016)


    Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.

    Can you provide sample data for those tables - they look to be different to the tables in your original post.

    I'm fully aware of how esoterical it sounds when I evaluate a query by the way it looks, but you know it was one of those moments... If its right or normal in this case to group by a bunch of fields why wouldnt it be better to just stick to use DISTINCT on all the values but the total and get the total from an aggregate subquery?


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

  • sys.user (2/25/2016)


    Luis Cazares (2/25/2016)


    Dodgier according to who?

    If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.

    I used a CTE because I find derived tables dodgy. 😀

    Your initial problem was that you were including the table twice, once aggregated and once as it is.

    tindog (2/25/2016)


    Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.

    Can you provide sample data for those tables - they look to be different to the tables in your original post.

    I'm fully aware of how esoterical it sounds when I evaluate a query by the way it looks, but you know it was one of those moments... If its right or normal in this case to group by a bunch of fields why wouldnt it be better to just stick to use DISTINCT on all the values but the total and get the total from an aggregate subquery?

    Because if you query the table twice unnecessarily, performance will decrease.

  • sys.user (2/25/2016)


    Luis Cazares (2/25/2016)


    Dodgier according to who?

    If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.

    I used a CTE because I find derived tables dodgy. 😀

    Your initial problem was that you were including the table twice, once aggregated and once as it is.

    tindog (2/25/2016)


    Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.

    Can you provide sample data for those tables - they look to be different to the tables in your original post.

    I'm fully aware of how esoterical it sounds when I evaluate a query by the way it looks, but you know it was one of those moments... If its right or normal in this case to group by a bunch of fields why wouldnt it be better to just stick to use DISTINCT on all the values but the total and get the total from an aggregate subquery?

    Because DISTINCT is actually a common indicator of a problem, people use it to mask problems caused by incorrect joins. You were reading the table twice and then discarded the unnecessary duplicates caused by the additional table read.

    As I showed, the subquery wasn't the problem and you could use it without problems, I just prefer to use it in the form of a CTE which leaves the main query "clean".

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thats alright then... Thanks both for your help.


    My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply