Calcuating median values

  • Hi.  I've been reading this forum for a couple of weeks, and have really learned a lot - so thank you!

     

    I've got the task of calculating the a median value without using cursors or while loops.  I found some code to calculate the median (obtained from http://databases.aspfaq.com/database/how-do-i-calculate-the-median-in-a-table.html):

        SELECT AVG(dailyrate) FROM

    (

        SELECT dailyrate FROM (

            SELECT TOP 1 dailyrate FROM

            (

                SELECT TOP 50 PERCENT dailyrate

                FROM test_table ORDER BY dailyrate

            ) sub_a

            ORDER BY dailyrate DESC

        ) sub_1

        UNION ALL

        SELECT dailyrate FROM (

            SELECT TOP 1 dailyrate FROM

            (

                SELECT TOP 50 PERCENT dailyrate

                FROM test_table ORDER BY dailyrate DESC

            ) sub_b

            ORDER BY dailyrate

        ) sub_2

    ) median

     

    This works fine if I have a table with the DailyRate column by itself.  However, I need to group by other columns.  For example:

     

    I have the fields BookingMonth,OperatingUnit,BookingSource,Daily Rate in the table Stage.

     

    These are the detail records:

    BookingMonth      OperatingUnit     BookingSource     Daily Rate

    200701            APAC              Web               180

    200701            APAC              Web               200

    200701            APAC              Web               240

    200701            EMEA              CallCenter        200

    200701            EMEA              CallCenter        250

     

    This is what is should look like with the median:

    BookingMonth      OperatingUnit     BookingSource     Daily Rate

    200701            APAC              Web               200

    200701            EMEA              CallCenter        225

     

    There’s probably a very obvious answer that I’m missing, but I just can’t see it.  I would appreciate any help.

     

    Thanks,

    Marianne

  • Hi Marianne,

    I agree this site is very good for helping people as well as learning.

    Seeing as though you are keen on learning,

    try reading up about the following clause in SQL, I have just starting using this for similar queries:

    "OVER" clause.

    If you have no luck let me know and I can write a query up for you.

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Chris,

    As you suggested, I just did some experimenting with the "OVER" clause.  I partitioned by BookingMonth, OperatingUnit, and BookingSource, calculating AVG, COUNT, MIN, and MAX.  How is that different from using GROUP BY? - my results were the same either way.  Also, I'm not sure how I would incorprate the code from my original post to calculate the median.

    Thanks, Marianne

  • You are using SQL Server 2005, are you?

    asc desc diff

    1   4    3  -- even number of datapoints

    2   3    1

    3   2   -1

    4   1   -3

    1   3    2  -- odd number of datapoints

    2   2    0

    3   1   -2

    select avg(val) as median from

    (select row_number() over (order by val asc) as sort_asc

    , row_number() over (order by val desc) as sort_desc

    , val

    from table1) as t

    where sort_asc - sort_desc between -1 and 1

     


    N 56°04'39.16"
    E 12°55'05.25"

  • FYI - I managed to work out how to get the info I need when I have an odd number of records (field names differ from original example).  I used NTILE and OVER:

    SELECT bookingsource,

     bookingtype,

     operatingunit,

     max(dailyrate) as MedRate

    FROM

     (SELECT bookingsource,

       bookingtype,

       operatingunit,dailyrate,

       ntile(2) over (partition by  bookingsource,

           bookingtype,

           operatingunit

           ORDER BY dailyrate) med_ntile

     FROM testmedian2) med

    WHERE med_ntile = 1

    GROUP BY  bookingsource,

     bookingtype,

     operatingunit

    Still working on getting it done with an even number of records.  Trying to somehow use "% 2" to determine whether the number of records is odd or even.

    Peter - I still have to try your method.  I'll try it today.

    Thanks, Marianne

  • Peter - So far, it seems to work beautifully.  THANK YOU!!!!

  • select  bookingsource,

      bookingtype,

      operatingunit,

      avg(dailyrate) as average

    from  (

       select bookingsource,

        bookingtype,

        operatingunit,

        row_number() over (partition by bookingsource, bookingtype, operatingunit order by dailyrate) as sort_asc,

        row_number() over (partition by bookingsource, bookingtype, operatingunit order by dailyrate desc) as sort_desc,

        dailyrate

       from testmedian2

     &nbsp as t

    where  sort_asc - sort_desc between -1 and 1

    group by bookingsource,

      bookingtype,

      operatingunit

    order by bookingsource,

      bookingtype,

      operatingunit

     


    N 56°04'39.16"
    E 12°55'05.25"

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply