Group By clause

  • I'm trying to group by a field named Vehiclename but also need all the other fields listed in my select statement below. I understand about the aggregate error message but when I include all these fields in the Group By clause everything gets all mixed up. What is the best way to get the desired results.

    SELECT Vehiclename, Staffname, Groupname, odometer, speed_hor, Timedeltamin, info_time

    FROM gis.avl_history_mups

    GROUP BY Vehiclename

    ORDER BY info_time;

    Thanks,

    Nelson

  • newbie reply.....

    My understanding is that GROUP BY clauses work in conjunction with aggregate queries, such as SUM, AVG, etc.

    For example, if you have a list of cars and want to see how many "Hondas", "Toyotas" or "Fords" there are in your car list, you would group by "make", ie

    select make, count(make)

    from your_table

    group by make

    You're missing a value around which to group by in your query.

    Hope this helps....

  • nelson.ahrens (3/6/2009)


    I'm trying to group by a field named Vehiclename but also need all the other fields listed in my select statement below. I understand about the aggregate error message but when I include all these fields in the Group By clause everything gets all mixed up. What is the best way to get the desired results.

    SELECT Vehiclename, Staffname, Groupname, odometer, speed_hor, Timedeltamin, info_time

    FROM gis.avl_history_mups

    GROUP BY Vehiclename

    ORDER BY info_time;

    Thanks,

    Nelson

    As already stated, you have no aggregations in the SELECT list... why do you believe you need a GROUP BY in this?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Nelson, could you please be specific about "the desired results?" We really have no idea of what you are trying to accomplish. It is most helpful if you can set up the problem as suggested in the link under Jeff's message. You can get your answer MUCH quicker if your question says:

    1. I have table(s) that look like this.

    (example)

    2. How do I write my query so that results look like this.

    (example)

    Thanks 🙂

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • I have colleagues who use GROUP BY clauses without aggregation in place of a SELECT DISTINCT. The results appear to be the same. I've always used a DISTINCT if I don't need aggregation.

  • Nelson,

    Do you understand the difference between GROUP BY & ORDER BY? GROUP BY is like taking a bowl of fruit and putting the grapes with the grapes and the apples with the apples, etc. ORDER BY is the order in which the final presentation is handed to you.

    Say you have yellow apples, red apples, green grapes and red grapes all in your bowl. You separate out the grapes, putting both colors together, and the apples, again putting both colors together, and then say "Order by color". The end result is that you suddenly have green grapes, red apples, red grapes, and yellow apples in that order. Which is not what you seem to be expecting.

    Take another look at your query. First you have a syntax problem. You can't use a GROUP BY statement with only one non-aggregated value in the statement. SQL Server requires that you have ALL non-aggregated values in the GROUP BY statement. GROUP BY is not a single field statement unless you have no other non-aggregated fields.

    So

    SELECT Vehiclename, Staffname, Groupname, odometer, speed_hor,

    Timedeltamin, info_time

    FROM gis.avl_history_mups

    GROUP BY Vehiclename

    ORDER BY info_time;

    Has to actually be:

    SELECT Vehiclename, Staffname, Groupname, odometer, speed_hor,

    Timedeltamin, info_time

    FROM gis.avl_history_mups

    GROUP BY Vehiclename, Staffname, Groupname, odometer,

    speed_hor, Timedeltamin, info_time

    ORDER BY info_time;

    And you do the GROUP BY in the order of Groupings you want. So, if you want VehicleName to be the primary grouping (or fruitname in the instance above), you'd list that first and then everything else in the order that you want grouped. (IE, "Group by Fruitname, Color")

    The second problem, as others have mentioned, is you have no aggregate (Min, Max, Avg, Count, Sum, etc.) in your SELECT query. Which you may or may not want. If you don't want an aggregate, you certainly don't need one. However, if you don't have an aggregate, you really don't need the GROUP BY statement either. An ORDER BY would work just as well.

    The last problem is that you're ordering your list by info_time (or color in the above example), which is overriding your grouping. Remember, the order in which each clause is listed is important. ORDER BY superceeds GROUP BY. HAVING superceeds WHERE. And your "ON"s for the JOIN statements can superceed each other depending on which order you have them listed because they are all filters of a sort.

    If you want to do a non-aggregate query, but you want items grouped together, you can do one of three things.

    1) Use a GROUP BY without an ORDER BY (works especially well with an aggregate)

    2) Use an ORDER BY without a GROUP BY

    3) Use both, but make sure the ORDER BY is nearly identical to the GROUP BY

    To further explain #3, you don't have to have the whole SELECT list in the ORDER BY, just the items you want to order. Example:

    SELECT Vehiclename, Staffname, Groupname, odometer, speed_hor,

    Timedeltamin, info_time

    FROM gis.avl_history_mups

    GROUP BY Vehiclename, Staffname, Groupname, odometer, speed_hor,

    Timedeltamin, info_time

    ORDER BY Vehiclename, info_time;

    Hope this helps.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • I have colleagues who use GROUP BY clauses without aggregation in place of a SELECT DISTINCT. The results appear to be the same. I've always used a DISTINCT if I don't need aggregation.

    That came up in another thread recently. The execution plans are the same.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Bob Hovious (3/9/2009)


    I have colleagues who use GROUP BY clauses without aggregation in place of a SELECT DISTINCT. The results appear to be the same. I've always used a DISTINCT if I don't need aggregation.

    That came up in another thread recently. The execution plans are the same.

    The execution plans might be the same, but the results might not always be. Group By doesn't guarantee distinct results.

    Depends on the query, the tables / joins, and records involved.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (3/9/2009)


    Bob Hovious (3/9/2009)


    I have colleagues who use GROUP BY clauses without aggregation in place of a SELECT DISTINCT. The results appear to be the same. I've always used a DISTINCT if I don't need aggregation.

    That came up in another thread recently. The execution plans are the same.

    The execution plans might be the same, but the results might not always be. Group By doesn't guarantee distinct results.

    Depends on the query, the tables / joins, and records involved.

    Why wouldn't it guarantee uniqueness? You end up with distinct combinations of each item in the GROUP BY list, which is essentially what the DISTINCT will do. What am I missing?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (3/9/2009)

    Why wouldn't it guarantee uniqueness? You end up with distinct combinations of each item in the GROUP BY list, which is essentially what the DISTINCT will do. What am I missing?

    I actually had an example of this last month here at my workplace. Let me see if I can find it. But ISTR that we ended up with some duplicate records due to a JOIN's ON clause. I had to add Distinct in to guarantee the uniqueness, which annoyed me to no end as I was trying to avoid using Distinct.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Aha! Remembered what it was. This is a poor example because I can't give you my real data and I had a much more complex query. Suffice to say, though, that Table B has multiple variations on a product that can all be valid for a single record in Table A

    Table A has StoreID, StoreName, DistrictNumber.

    Table B has StoreID, ProductNumber, ProductName, Quantity

    TableA

    StoreID DistrictNumber

    1 125

    2 234

    3 867

    TableB

    StoreID ProductNumber Quantity

    1 AB752 10 per

    1 CD257 5 per

    1 AB752 25 per

    Select a.StoreID, a.DistrictNumber, b.ProductNumber

    from TableA a

    join TableB b

    on a.StoreID = b.StoreID

    Group By a.StoreID, a.DistrictNumber, b.ProductNumber

    will give a distinct record set. However, the below (and it does sometimes happen where grouping happens on something not part of the Select list) will NOT give a distinct record set.

    Select a.StoreID, a.DistrictNumber, b.ProductNumber

    from TableA a

    join TableB b

    on a.StoreID = b.StoreID

    Group By a.StoreID, a.DistrictNumber, b.ProductNumber, b.Quantity

    See the addition? Suddenly I have 3 records for Store 1 when I only want 2. Granted, it's weird to have a field in a GROUP BY clause that you don't have in a SELECT, but it has happened to me in the past. Also, if I have a join that goes many to many, or if something in the ON statement is hinky, it'll give me duplicates even though the recordset I'm pulling shouldn't include certain unique values.

    Yes, this is a strange case that shouldn't happen. But I've actually seen it happen too many times (with other DBAs besides me) to completely agree that GROUP BY will always give a distinct record set.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • All right - if you don't make the SELECT match the GROUP BY, then sure, you will end up with multiple instances of the same combination in the the SELECT area. I didn't think of that one.....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • OK, that's true, Brandie. But the solution isn't to add DISTINCT like this:

    Select DISTINCT a.StoreID, a.DistrictNumber, b.ProductNumber

    from TableA a

    join TableB b

    on a.StoreID = b.StoreID

    Group By a.StoreID, a.DistrictNumber, b.ProductNumber, b.Quantity

    Which just adds extra overhead.

    The solution is to fix the Group BY:

    Select DISTINCT a.StoreID, a.DistrictNumber, b.ProductNumber

    from TableA a

    join TableB b

    on a.StoreID = b.StoreID

    Group By a.StoreID, a.DistrictNumber, b.ProductNumber -- , b.Quantity

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarry,

    Depends on the situation. You can't always fix the Group By. Sometimes it screws up your result set if you do.

    Believe me, I've tried that. Sometimes fixing the Group By works. Sometimes it doesn't. Depends on what end result you're attempting to achieve and where other columns might be hiding that you wouldn't normally put in the SELECT.

    Oh, just thought of a good example!

    Case statements. You do a "CASE MyStatus When 'A' Then 1 When 'C' Then 0 When 'Z' Then 0 End as StatusCount".

    You HAVE to group by MyStatus even though it's nowhere else within your SELECT statement. Suddenly, if you have a status history table you're connecting to, you get duplicated results. Therefore, you have to use DISTINCT instead of "fixing" the GROUP BY.

    EDIT: Another example uses LEFT OUTER JOINS and dates. You could come up with a NULL date and a real date for the same order shipping detail. Suddenly you have two records, even though you're not using the date in the SELECT statement.

    The list goes on and on for possible reasons why removing that "extra" column in the GROUP BY won't work.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (3/9/2009)


    RBarry,

    Depends on the situation. You can't always fix the Group By. Sometimes it screws up your result set if you do.

    Believe me, I've tried that. Sometimes fixing the Group By works. Sometimes it doesn't. Depends on what end result you're attempting to achieve and where other columns might be hiding that you wouldn't normally put in the SELECT.

    Oh, just thought of a good example!

    Case statements. You do a "CASE MyStatus When 'A' Then 1 When 'C' Then 0 When 'Z' Then 0 End as StatusCount".

    You HAVE to group by MyStatus even though it's nowhere else within your SELECT statement. Suddenly, if you have a status history table you're connecting to, you get duplicated results. Therefore, you have to use DISTINCT instead of "fixing" the GROUP BY.

    EDIT: Another example uses LEFT OUTER JOINS and dates. You could come up with a NULL date and a real date for the same order shipping detail. Suddenly you have two records, even though you're not using the date in the SELECT statement.

    The list goes on and on for possible reasons why removing that "extra" column in the GROUP BY won't work.

    Post an example, table DDL, sample data, expected results. Best way to determine this in the end.

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply