Sum in Exists Clause

  • I am looking for a lead or a psuedo sql to perform the following: I have a sales header and sales detail tables. The detail table contains a record for each individual item scanned (qty, price, etc.), whereas the header contains the total for that sale and other aggregate info. The information I need is to find the headers that have a particular item X in its detail transactions (and nothing else), and the sum of the price for all instances of item X under that header is > 0.

    Here are general table definitions, and sample data :

    Header :

     HeaderID Sum

     100  50    

     200  10

     300  50

     400  15

    Detail :

     HeaderId  DetailId  Item  Price

     100  1001  X 10

     100  1002  X 10

     100  1003  X -20 

     100  1004  Y 50

     200  2001  X 10

     200  2002  X 10

     300  3001  Y 50

     400  4001  X 5

     400  4002  Y 10

    In this example the query should only return Header 200, since in header 100, sum (price) of X is 0 (10+10-20) hence excluded, and header 300 has no detail records with item X. Though Header 400 has sum of X item sales > 0, it also has Y item sale under the same header hence does not qualify. I am having trouble with how I am going to construct an exists clause that uses aggregates (sum(price) ). Both the tables are fairly huge (over 50 mil headers and 500 million details). I appreciate any help.

    Thanks.

     

  • Select HeaderID

    From Header As h

    Where HeaderID In (

      Select HeaderID

      From Detail

      Where Item = 'X'

      Group By HeaderID

      Having Sum(Price) > 0

    )

    And Not Exists (

      Select *

      From Detail As d

      Where d.HeaderId = h.HeaderID

      And   d.Item <> 'X'

    )

     

    [Edit] Ugh. Just re-read the part about 500 million detail rows. Maybe 2 sub-queries isn't optimal.

    This might perform better:

    Select HeaderID

    From Header As h

    Where HeaderID In (

      Select HeaderID

      From Detail

      Group By HeaderID

      Having Sum( Case Item When 'X' Then Price Else 0 End ) > 0

      And Count(*) = Sum( Case Item When 'X' Then 1 Else 0 End )

    )

    1st part of the having ensures Sum of only X items is greater than zero. 2nd part ensures only 'X' is in the details for a given header.

     

  • Thank you, you have set me in the right direction.

    Is there "ever" a way to use aggregates in the exists clasue, like this?

    (I do not think so, but want to be sure)

     SELECT .....

     FROM .......Details AI

     EXISTS ( (SELECT SUM (XPRICE)

        FROM Details AIC

        WHERE AH.HEADER = AIC.HEADER AND

       AI.ITEMCODE = AIC.ITEMCODE) > 0)  

  • No, the problem is that Exists is testing for existence of at least 1 record, but when you use an aggregate like SUM(), without a Having clause, you always get a record in the result set.

    That's why you need to use Having in order to produce an empty record set if the aggregates do not return the desired values.

  • Because of several conditions may affect the performance on such large table you can try also a modified version of PW solution

    Select HeaderID

    From Header As h

    Where HeaderID In (

      Select HeaderID

      From Detail

      Group By HeaderID

      Having 

          Count(distinct Item) = 1

      And Min(Item) =  'X'

      And Sum(price) > 0 )

    )

    HTH

     


    * Noel

  • Here is another solution involving just joins

    select h.headerid,h.[sum]

    from header h

    inner join

     (select distinct b.headerid as bHID, b.item as bItem

     from detail b

     inner join

     (select a.hid, count(*) as ItemsSold

     from (select distinct headerid as hid,item as itemid  from detail ) a

     group by hid

     having count(*)=1) c

     on c.hid = b.headerid) d

    on h.headerid=d.bHID

    where d.bItem='X' 

    and h.[sum] >0

    Explanation:

    -- list headers, and their items

    select distinct headerid as hid,item as itemid  from detail

    -- determine headers with 1 item

     select a.hid, count(*) as ItemsSold

     from (select distinct headerid as hid,item as itemid  from detail ) a

     group by hid

     having count(*)=1

    --  join with the detail table to pick up the item column that will be used to --- select the specific item later on

            select distinct b.headerid as bHID, b.item as bItem

     from detail b

     inner join

     (select a.hid, count(*) as ItemsSold

     from (select distinct headerid as hid,item as itemid  from detail ) a

     group by hid

     having count(*)=1) c

     on c.hid = b.headerid

    -- now join with the header and apply qualifying conditions

    select h.headerid,h.[sum]

    from header h

    inner join

     (select distinct b.headerid as bHID, b.item as bItem

     from detail b

     inner join

     (select a.hid, count(*) as ItemsSold

     from (select distinct headerid as hid,item as itemid  from detail ) a

     group by hid

     having count(*)=1) c

     on c.hid = b.headerid) d

    on h.headerid=d.bHID

    where d.bItem='X' 

    and h.[sum] >0

    Cheers,

     

    Ben

  • Volume will be the killer here.

    Whatever solution you use must at least aggregate the Detail table. So the choice is where and when. I see two possibilities.

    Create a sub query (or create a temp table which may work faster) to aggregate Details collecting count of X and Y plus SUM(Price), join this to Header with the conditions you require.

    Join the two tables together, group by HeaderID and aggregate as above  (but will have to use CASE statements) and add a HAVING clause to do the criteria

    Far away is close at hand in the images of elsewhere.
    Anon.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply