Using Subqueries as Calculated Fields

  • I have a concept question on this code:

    SELECT cust_name,

    cust_state,

    (SELECT Count(*)

    FROM orders

    WHERE order.cust_id=customers.cust_id) as orders

    FROM customers

    ORDER BY cust_name;

    I understand how the Count function works with it counting the rows that are matching with regards to customers.cust_id with orders.customers_id. What concern me if what happens if there are many cust_names that are the same. How does SQL serve know to match up which cust_name with the correct cust_id in order to have the count function match the correct count with the customer_id?

  • SQL 2016 (3/29/2016)


    I have a concept question on this code:

    SELECT cust_name,

    cust_state,

    (SELECT Count(*)

    FROM orders

    WHERE order.cust_id=customers.cust_id) as orders

    FROM customers

    ORDER BY cust_name;

    I understand how the Count function works with it counting the rows that are matching with regards to customers.cust_id with orders.customers_id. What concern me if what happens if there are many cust_names that are the same. How does SQL serve know to match up which cust_name with the correct cust_id in order to have the count function match the correct count with the customer_id?

    It doesn't. It knows to connect them based on cust_id, because you tell it to. Presumably, cust_id is the primary key in the customers table, so it uniquely identifies only one customer. It doesn't matter how many customers have the same name, because it's not using the name field to make the connection.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Got it thanks for the insights : )

  • Quick thought, there is no need for the subquery as this can written as a normal query

    😎

    SELECT

    C.cust_id

    ,C.cust_name

    ,C.cust_state

    ,COUNT(*) AS ORDERS

    FROM customers C

    LEFT OUTER JOIN [order] O

    ON C.cust_id = O.cust_id

    GROUP BY C.cust_id

    ,C.cust_name

    ,C.cust_state

    Edit: missing C.cust_id in group by

  • Eirikur Eiriksson (3/29/2016)


    Quick thought, there is no need for the subquery as this can written as a normal query

    😎

    SELECT

    C.cust_id

    ,C.cust_name

    ,C.cust_state

    ,COUNT(*) AS ORDERS

    FROM customers C

    LEFT OUTER JOIN [order] O

    ON C.cust_id = O.cust_id

    GROUP BY C.cust_name

    ,C.cust_state

    First, that will fail, because you didn't include C.cust_id in the GROUP BY or an aggregate function. Secondly, it will probably be more efficient to group the orders before performing the join, especially if there is an index on the orders table that starts with custid.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (3/29/2016)


    Secondly, it will probably be more efficient to group the orders before performing the join, especially if there is an index on the orders table that starts with custid.

    If C.Cust_ID is unique, then the optimiser knows it only has to aggregate the Orders table by the joining column, and the two will likely be the same in terms of performance.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • drew.allen (3/29/2016)


    Eirikur Eiriksson (3/29/2016)


    Quick thought, there is no need for the subquery as this can written as a normal query

    😎

    SELECT

    C.cust_id

    ,C.cust_name

    ,C.cust_state

    ,COUNT(*) AS ORDERS

    FROM customers C

    LEFT OUTER JOIN [order] O

    ON C.cust_id = O.cust_id

    GROUP BY C.cust_name

    ,C.cust_state

    First, that will fail, because you didn't include C.cust_id in the GROUP BY or an aggregate function. Secondly, it will probably be more efficient to group the orders before performing the join, especially if there is an index on the orders table that starts with custid.

    Drew

    thanks, missed the cost_id column.

    😎

    Pretty certain that the performance is almost identical if the right indices are in place, certainly will if the cust_id column is unique.

  • I probably think this query would perform better.

    SELECT cust_name,

    cust_state,

    (SELECT Count(*)

    FROM orders

    WHERE order.cust_id=customers.cust_id) as orders

    FROM customers

    ORDER BY cust_name;

    Because the aggregate function would be executed before the physical join on customer table takes place.Means a distinct count along with customerid from the Orders table would be returned before being joined to the Customers table.This would drastically reduce the no of rows for the outer input of the join.

    While the other query will have to first do a complete join on all the possible rows from the two tables and then sort this output(in case optimizer decides to use a stream aggregate) to perform an aggregate function to return the count value.

    --------------------------------------------------------------------------------------------------
    I am just an another naive wannabe DBA trying to learn SQL Server

  • Frankly, I don't understand why anyone would be interested to compare performance between two queries that do not return the same results. At most one of them can be correct. I don't care about performance of incorrect queries.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (3/30/2016)


    Frankly, I don't understand why anyone would be interested to compare performance between two queries that do not return the same results. At most one of them can be correct. I don't care about performance of incorrect queries.

    Good point -- it would appear that the original query would return Orders as 0 for any client with no orders but the suggested replacement query will surely return 1, would it not?

    Nice catch.

    - Les

  • Good point -- it would appear that the original query would return Orders as 0 for any client with no orders but the suggested replacement query will surely return 1, would it not?

    Yes and that's the only difference.The first query will return 0 for customers having no orders while the second one will return 1.The order count for the remaining records and the total no of rows returned by both the queries will always be the same, not a single record more or less.

    Even if we fix the second query to fix the above problem.

    SELECT

    C.cust_name

    ,C.cust_state

    ,COUNT(O.cust_id) AS ORDERS

    FROM customers C

    LEFT OUTER JOIN [order] O

    ON C.cust_id = O.cust_id

    GROUP BY C.cust_name

    ,C.cust_state

    The first query will always scale over it.

    EDIT : Fixed a typo in the query

    --------------------------------------------------------------------------------------------------
    I am just an another naive wannabe DBA trying to learn SQL Server

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply