Subqueries

  • I have a query with many subqueries that use the same filters. Is there a way I can list the filters only once?

    SELECT MetroCity AS City, COUNT(*) AS CountMonthOne

    , (SELECT COUNT(*) FROM Property a WHERE a.MetroCity = s.MetroCity

    AND SaleDate Between '2005-08-01' AND '2005-08-31'

    AND County = 'King'

    AND MetroCity IS NOT NULL

    AND PropertyUseCode = 'Single Family'

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    ) AS CountMonthTwo

    , AVG(SalePrice) AS AverageMonthOne

    , (SELECT AVG(SalePrice) FROM Property b WHERE b.MetroCity = s.MetroCity

    AND SaleDate Between '2005-08-01' AND '2005-08-31'

    AND County = 'King'

    AND MetroCity IS NOT NULL

    AND PropertyUseCode = 'Single Family'

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    ) AS AverageMonthTwo

    , (SELECT COUNT(*) FROM Property c WHERE c.MetroCity = s.MetroCity

    AND SaleDate Between '2006-01-01' AND '2006-08-31'

    AND County = 'King'

    AND MetroCity IS NOT NULL

    AND PropertyUseCode = 'Single Family'

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    ) AS CountYTDOne

    , (SELECT COUNT(*) FROM Property d WHERE d.MetroCity = s.MetroCity

    AND SaleDate Between '2005-01-01' AND '2005-08-31'

    AND County = 'King'

    AND MetroCity IS NOT NULL

    AND PropertyUseCode = 'Single Family'

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    ) AS CountYTDTwo

    , (SELECT AVG(SalePrice) FROM Property e WHERE e.MetroCity = s.MetroCity

    AND SaleDate Between '2006-01-01' AND '2006-08-31'

    AND County = 'King'

    AND MetroCity IS NOT NULL

    AND PropertyUseCode = 'Single Family'

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    ) AS AverageYTDOne

    , (SELECT AVG(SalePrice) FROM Property f WHERE f.MetroCity = s.MetroCity

    AND SaleDate Between '2005-01-01' AND '2005-08-31'

    AND County = 'King'

    AND MetroCity IS NOT NULL

    AND PropertyUseCode = 'Single Family'

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    ) AS AverageYTDTwo

    FROM dbo.Property s

    WHERE SaleDate Between '2006-08-01' AND '2006-08-31'

    AND County = 'King'

    AND MetroCity IS NOT NULL

    AND PropertyUseCode = 'Single Family'

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    GROUP BY MetroCity, s.MetroCity

    ORDER BY MetroCity

  • Create table UDF and supply changed values as parameters to it.

    Do your aggregations from the values returned by this UDF.

    _____________
    Code for TallyGenerator

  • I couldn't test it but the following should work and makes the code a little less redundant.

    select metrocity

          ,saledate

          ,saleprice

       into #temp_

       from property

       where saledate between '2005-01-01' and '2005-08-31'

          or saledate between '2006-01-01' and '2006-08-31'

         AND County = 'King'

         AND MetroCity IS NOT NULL

         AND PropertyUseCode = 'Single Family'

         AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

    select metrocity as city

           ,count(*) as CountMonthOne

           ,(SELECT COUNT(*) FROM #temp t WHERE t.MetroCity = s.MetroCity and saledate between '2005-08-01' and '2005-08-31') as CountMonthTwo

           ,AVG(SalePrice) AS AverageMonthOne

           ,(SELECT AVG(SalePrice) FROM #temp t WHERE t.MetroCity = s.MetroCity and saledate between '2005-08-01' and '2005-08-31') as AverageMonthTwo

           --,etc......

       FROM dbo.Property s

       WHERE SaleDate Between '2006-08-01' AND '2006-08-31'

         AND County = 'King'

         AND MetroCity IS NOT NULL

         AND PropertyUseCode = 'Single Family'

         AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5

       GROUP BY MetroCity, s.MetroCity

       ORDER BY MetroCity

    --James

  • Here's a solution that gets rid of all subqueries. It could be written simpler with CTE if you have SS2005. If you use the temporary table, I would still use the unions in a derived table rather than the subqueries--and the where clause in the unions would just be "WHERE SaleDate between...". BTW, I got much better performance out of both our queries by creating one index on County, PropertyUseCode and SaleDate.

    select  MetroCity as City, 
            Count(MonthOne) as CountMonthOne, Count(MonthTwo) as CountMonthTwo,
            Avg(MonthOne)   as AvgMonthOne,   Avg(MonthTwo)   as AvgMonthTwo,
            Count(YTDOne)   as CountYTDOne,   Count(YTDTwo)   as CountYTDTwo,   
            Avg(YTDOne)     as AvgYTDOne,     Avg(YTDTwo)     as AvgYTDTwo
    from    (
        select  MetroCity, SalePrice as MonthOne, null as YTDOne, null as MonthTwo, null as YTDTwo
        from    dbo.Property
        where   MetroCity IS NOT NULL
            AND County          = 'King'
            and PropertyUseCode = 'Single Family'
            AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
            and SaleDate Between '2006-08-01' AND '2006-08-31'
        union all
        select  MetroCity, null as MonthOne, SalePrice as YTDOne, null as MonthTwo, null as YTDTwo
        from    dbo.Property
        where   MetroCity IS NOT NULL
            AND County          = 'King'
            and PropertyUseCode = 'Single Family'
            AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
            and SaleDate Between '2006-01-01' AND '2006-08-31'
        union all
        select  MetroCity, null as MonthOne, null as YTDOne, SalePrice as MonthTwo, null as YTDTwo
        from    dbo.Property
        where   MetroCity IS NOT NULL
            AND County          = 'King'
            and PropertyUseCode = 'Single Family'
            AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
            and SaleDate Between '2005-08-01' AND '2005-08-31'
        union all
        select  MetroCity, null as MonthOne, null as YTDOne, null as MonthTwo, SalePrice as YTDTwo
        from    dbo.Property
        where   MetroCity IS NOT NULL
            AND County          = 'King'
            and PropertyUseCode = 'Single Family'
            AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
            and SaleDate Between '2005-01-01' AND '2005-08-31'
        ) as P
    order by MetroCity
    group by MetroCity;

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • If Tomm's solution works then this might be a bit faster because it should only do one pass thru the table. Since I have no data I could not test this...

    select

    MetroCity as City,        

    Count(MonthOne) as CountMonthOne,

    Count(MonthTwo) as CountMonthTwo,

    Avg(MonthOne)   as AvgMonthOne,

    Avg(MonthTwo)   as AvgMonthTwo,

    Count(YTDOne)   as CountYTDOne,

    Count(YTDTwo)   as CountYTDTwo,

    Avg(YTDOne)     as AvgYTDOne,

    Avg(YTDTwo)     as AvgYTDTwo

    from   

    (   

    select 

    MetroCity,

    CASE

     WHEN SaleDate Between '2006-08-01' AND '2006-08-31' THEN SalePrice

     ELSE NULL

    END AS MonthOne,

    CASE

     WHEN SaleDate Between '2006-01-01' AND '2006-08-31' THEN SalePrice

     ELSE NULL

    END AS YTDOne,

    CASE

     WHEN SaleDate Between '2005-08-01' AND '2005-08-31' THEN SalePrice

     ELSE NULL

    END AS MonthTwo,

    CASE

     WHEN SaleDate Between '2005-01-01' AND '2005-08-31' THEN SalePrice

     ELSE NULL

    END AS YTDTwo   

    from   

    dbo.Property   

    where   MetroCity IS NOT NULL       

    AND County          = 'King'       

    and PropertyUseCode = 'Single Family'       

    AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5       

    ) as P

    order by MetroCitygroup by MetroCity;

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • Thanks everyone for your input.  These were all great suggestions.  They all work good and when comparing speed only, here are the results:

    The one I posted takes about 4 seconds

    JLK: 1 second to create the temp table and then 1 second to get the result set. I combined both sets into one stored procedure and then got the result set in 2 seconds and then in 1 second in subsequent runs.

    Tomm Carr: 1st run: 23 seconds, 2nd run: 5 seconds; Subsequent runs: 1-2 seconds

    JacekO: 1st run: 22 seconds, 2nd run: 4 seconds; Subsequent runs: 1-2 seconds

    There are 1,338,224 records in the table.  Thanks again!

    Note:  I did need to make one correction to JLK's query:   where (saledate between '2005-01-01' and '2005-08-31'  or saledate between '2006-01-01' and '2006-08-31').  Need to confine the "OR."  For the other two, needed to put GROUP BY before ORDER BY.

  • Wow. I generally do all I can to avoid temp tables but with almost 1.5M records, apparently the initial overhead averages out. Just for grins, how good is the hybrid of the best. Test this out:

    select  metrocity, saledate, saleprice
    into    #temp
    from    property
    where   MetroCity IS NOT NULL
        AND County          = 'King'
        AND PropertyUseCode = 'Single Family'
        AND SalePrice       >= (ApprLandVal + ApprImpsVal) * .5
        AND (saledate between '2005-01-01' and '2005-08-31'
            or saledate between '2006-01-01' and '2006-08-31');
    
    select
            MetroCity as City,        
            Count(MonthOne) as CountMonthOne,
            Count(MonthTwo) as CountMonthTwo,
            Avg(MonthOne)   as AvgMonthOne,
            Avg(MonthTwo)   as AvgMonthTwo,
            Count(YTDOne)   as CountYTDOne,
            Count(YTDTwo)   as CountYTDTwo,
            Avg(YTDOne)     as AvgYTDOne,
            Avg(YTDTwo)     as AvgYTDTwo
    from   (   
        select 
            MetroCity,
            CASE WHEN SaleDate Between '2006-08-01' AND '2006-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS MonthOne,
            CASE WHEN SaleDate Between '2006-01-01' AND '2006-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS YTDOne,
            CASE WHEN SaleDate Between '2005-08-01' AND '2005-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS MonthTwo,
            CASE WHEN SaleDate Between '2005-01-01' AND '2005-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS YTDTwo   
        from    #Temp) as P
    group by MetroCity
    order by MetroCity;

    What are the numbers on that?

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Tom,

    I ran both steps together and got the results in about 1 to 2 seconds!

  • And since there's only one SELECT from #temp...
     
    select
            MetroCity as City,        
            Count(MonthOne) as CountMonthOne,
            Count(MonthTwo) as CountMonthTwo,
            Avg(MonthOne)   as AvgMonthOne,
            Avg(MonthTwo)   as AvgMonthTwo,
            Count(YTDOne)   as CountYTDOne,
            Count(YTDTwo)   as CountYTDTwo,
            Avg(YTDOne)     as AvgYTDOne,
            Avg(YTDTwo)     as AvgYTDTwo
    from   (   
        select 
            MetroCity,
            CASE WHEN SaleDate Between '2006-08-01' AND '2006-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS MonthOne,
            CASE WHEN SaleDate Between '2006-01-01' AND '2006-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS YTDOne,
            CASE WHEN SaleDate Between '2005-08-01' AND '2005-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS MonthTwo,
            CASE WHEN SaleDate Between '2005-01-01' AND '2005-08-31' 
                THEN SalePrice
                ELSE NULL
            END AS YTDTwo   
        from    (select  metrocity, saledate, saleprice
    		from    property
    		where   MetroCity IS NOT NULL
        		AND County          = 'King'
        		AND PropertyUseCode = 'Single Family'
        		AND SalePrice       >= (ApprLandVal + ApprImpsVal) * .5
        		AND (saledate between '2005-01-01' and '2005-08-31'
            		or saledate between '2006-01-01' and '2006-08-31')
    	) t 
    	) as P
    group by MetroCity
    order by MetroCity;
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • David: Glad I/we could help, and yes I did forget the parenthesis to group the "OR" clause.  I'm also glad to see you got a lot of help/ideas on how to best write the query.  I'm suprised that the time with the temp table and all the coorelated sub-queries was faster than a few of the others posted.   Just goes to show that sometimes, simplier is better (though like most folks I think try to avoid them if possible) Thanks for the feed back on the posts.

    Good Luck, James.

  • Instead of using #temp table use the table variable. It should work faster and is not creating any locks on the master db. And yes, sometimes when the initial record set is large I also preload the subset I need into a table variable in order to maximize performance.

    And thanks for posting back, with the results.

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply