APPLY vs. Correlated Scalar Subquery in SELECT clause

  • Hi,

    Is better we use APPLY instead of correlated scalar subqueries in SELECT clause in any case on not?

    For example these two queries will show us the same result:

    First Query (Subquery)

    SELECT t.*, list = (SELECT OrderID + ', '

    FROM Orders

    WHERE customerID = t.customerID

    FOR XML PATH(''))

    FROM Customers T

    Second Query (APPLY)

    SELECT t.*, D.list

    FROM Customers T

    OUTER APPLY (SELECT orderID + ', '

    FROM Orders

    WHERE customerID = T.customerID

    FOR XML PATH('')) D(list)

  • Personally, I'd say yes, use the APPLY operation, but, as with all things, you should test both, look at the execution plans, the performance time, and then make the determination that way.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • I'd say "It Depends". In theory, they should have about the same execution plan. In reality, you need to turn on SQL Profiler and check out things like Duration vs CPU time used, reads, writes, and row counts.

    Then, you need to build a really large test table (I typically gen a million rows to check for scalability) and run it again.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • In terms of performance on queries like your example, I agree there should be very little difference, although as Jeff said, that may not be true for very large datasets.

    However, there is one example where you just cannot use a subquery, but can use APPLY.

    I have not thought of a good reason to do this, but it shows that there is a difference between them.

    You cannot use an aggregate on a subquery, whereas you can on an APPLY.

    For example

    -- This is not allowed

    SELECT g.GroupName ,COUNT( (SELECT a.id FROM newsdata.dbo.Articles AS a WHERE a.GroupName = g.GroupName ) ) AS id_count

    FROM newsdata.dbo.Groups AS g

    GROUP BY g.GroupName

    -- you get this error

    -- Msg 130, Level 15, State 1, Line 3

    -- Cannot perform an aggregate function on an expression containing an aggregate or a subquery.

    -- this IS allowed

    SELECT g.GroupName ,COUNT( ax.id ) AS id_count

    FROM newsdata.dbo.Groups AS g

    OUTER APPLY (SELECT a.id FROM newsdata.dbo.Articles AS a WHERE a.GroupName = g.GroupName ) AS ax(id)

    GROUP BY g.GroupName

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (4/28/2010)


    In terms of performance on queries like your example, I agree there should be very little difference, although as Jeff said, that may not be true for very large datasets.

    However, there is one example where you just cannot use a subquery, but can use APPLY.

    I have not thought of a good reason to do this, but it shows that there is a difference between them.

    You cannot use an aggregate on a subquery, whereas you can on an APPLY.

    For example

    -- This is not allowed

    SELECT g.GroupName ,COUNT( (SELECT a.id FROM newsdata.dbo.Articles AS a WHERE a.GroupName = g.GroupName ) ) AS id_count

    FROM newsdata.dbo.Groups AS g

    GROUP BY g.GroupName

    -- you get this error

    -- Msg 130, Level 15, State 1, Line 3

    -- Cannot perform an aggregate function on an expression containing an aggregate or a subquery.

    -- this IS allowed

    SELECT g.GroupName ,COUNT( ax.id ) AS id_count

    FROM newsdata.dbo.Groups AS g

    OUTER APPLY (SELECT a.id FROM newsdata.dbo.Articles AS a WHERE a.GroupName = g.GroupName ) AS ax(id)

    GROUP BY g.GroupName

    Not sure why would you even use APPLY on that case instead of :

    SELECT g.GroupName , COUNT( a.id ) AS id_count

    FROM newsdata.dbo.Groups AS g

    INNER JOIN newsdata.dbo.Articles a

    ON a.GroupName = g.GroupName

    GROUP BY g.GroupName


    * Noel

  • noeld (4/28/2010)


    ...

    Not sure why would you even use APPLY on that case instead of :

    SELECT g.GroupName , COUNT( a.id ) AS id_count

    FROM newsdata.dbo.Groups AS g

    INNER JOIN newsdata.dbo.Articles a

    ON a.GroupName = g.GroupName

    GROUP BY g.GroupName

    Exactly, I am pretty sure you wouldn't...but the point was to show that there are situations where one works and the other doesn't... not to suggest that particular query was desirable.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

    Viewing 6 posts - 1 through 5 (of 5 total)

    You must be logged in to reply to this topic. Login to reply