Excel Function Trimmean

  • Hi All,

    I'm trying to find the function Trimmean to be used in a query but it has been a little challenging.   I'm assuming that a user function has to be developed but haven't been able to find any help online.  Anyone have any ideas or ever attempted this? 

    Thanks-jb

  • Trimmean

    What should this function do? Do you mind giving some more information on what you want to do?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Aah, found the translation to the German Excel function. This might look a little bit ugly, but see if it helps:

    SET NOCOUNT ON

    CREATE TABLE #Trimmean

    (

     col1 INTEGER NOT NULL PRIMARY KEY

    )

    INSERT INTO #Trimmean (col1) VALUES(1)

    INSERT INTO #Trimmean (col1) VALUES(12)

    INSERT INTO #Trimmean (col1) VALUES(32)

    INSERT INTO #Trimmean (col1) VALUES(14)

    INSERT INTO #Trimmean (col1) VALUES(16)

    INSERT INTO #Trimmean (col1) VALUES(17)

    INSERT INTO #Trimmean (col1) VALUES(1100)

    SET NOCOUNT OFF

    SELECT

     SUM(Summe)*1.0/(SELECT COUNT(*)-2 FROM #Trimmean)

    FROM

     (

     SELECT

      F1.col1 Summe

     FROM

      #Trimmean AS F1

      , #Trimmean AS F2

     GROUP BY

       F1.col1

     HAVING

      F1.col1 BETWEEN MIN(F2.col1)+1 AND MAX(F2.col1)-1

    &nbsp T1

    DROP TABLE #Trimmean

                              

    --------------------------

    18.200000000000

    (1 row(s) affected)

    It's the same result I get in Excel.

    HTH

     

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • If the values are arranged in a numerical order then TRIMMEAN ignores a specific number of values at either end of the list and then returns the AVERAGE of the remaining articles.

    For example, if you were to do a straight AVERAGE of employee salaries you would probably find that directors pay would skew the result and 99% of employees would have below average salaries.

  • Yes, once I found the German translation, it was pretty clear.

    Curious how the results are skewed, when you compare who gets the most money and who does the most work. I guess you'll find a high negative correlation close to -1.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Ah, took me a while to figure out what trimmean did.. (found the swedish translation)

    But I think it's missing the trim percentage parameter, not? If I read the help correctly, the function should trim a given percentage of rows - half from top and half from bottom, and in any case, at least one row from top and one from bottom. This suggests that more than one row from top/bottom may be removed before the average is calculated depending on how high the trim-percentage is.. right?

    Until we get support for SELECT TOP @var the only way I can write a function like this is by resorting to dynamic SQL...

    /Kenneth

  • Hm, reread the Excel help. Yes, apparently I missed the percentage parameter. That's interesting. I think, I'll have a play with it this evening.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • I think I know how to do it, but until Yukon, it's dynamic SQL only

    /Kenneth

  • Okay, hardcoded percentage. But what about

    SET NOCOUNT ON

    CREATE TABLE #Trimmean

    (

     col1 INTEGER NOT NULL PRIMARY KEY

    )

    INSERT INTO #Trimmean (col1) VALUES(1)

    INSERT INTO #Trimmean (col1) VALUES(12)

    INSERT INTO #Trimmean (col1) VALUES(32)

    INSERT INTO #Trimmean (col1) VALUES(14)

    INSERT INTO #Trimmean (col1) VALUES(16)

    INSERT INTO #Trimmean (col1) VALUES(17)

    INSERT INTO #Trimmean (col1) VALUES(1100)

    SET NOCOUNT OFF

    SELECT

     SUM(Summe)*1.0/COUNT(Summe)

    FROM

     (

     SELECT col1 AS Summe FROM #Trimmean

     WHERE

      col1 NOT IN (SELECT TOP 20 PERCENT col1 FROM #Trimmean ORDER BY col1)

      AND

      col1 NOT IN (SELECT TOP 20 PERCENT col1 FROM #Trimmean ORDER BY col1 DESC)  

    &nbsp T1

    DROP TABLE #Trimmean

                              

    --------------------------

    15.666666666666

    (1 row(s) affected)

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • (since col1 is int, I also return an int)

    Here's my take on it - assuming a 20% trim on the overall data.

    select  avg(z.col1) as mean

    from

          ( select max(x.col1) as col1

     from

           ( select top 40 percent col1

      from #trimmean

      order by col1 asc

           ) x

     union all

     select min(y.col1) as col1

     from

           ( select top 40 percent col1

      from #trimmean

      order by col1 desc

           ) y

          ) z

    mean          

    --------------

    15

    (1 row(s) affected)

    BTW, if you set both TOP (x and y) to 50, you end up with the setbased way to calculate the median

    -- edit --

    uhoh - that was infact the median query, but trimmed, so it's not quite a trimmean, but a trimmedian (which makes no difference from a median anyway)

    Have to go, fix the trimmean tomorrow perhaps

    /Kenneth

  • Yes, the set-based way for the median. But have a close look here http://groups.google.com/groups?hl=de&lr=&th=c8786f747b6cf92d&rnum=1 , compare the execution plans for both alternatives (TOP 50 blah blah and the solution in the link) and join my Steve Kass fan club. Let alone the fact that he posted the UDF approach to a SQL Server 7.0 question...

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Well, Steve has got a lot better head for numbers than I do

    This trimming business seems to be trickier than it first seemed. Looked at it for a while, but haven't time right now for any further play. It seems to me that one would require a sorted list (ie a temptable with a rowcounter) at the minimum in order to do this according to the Excel function spec.

    In short: (what think needs to be done is...) sort all values - remove the first and last row - remove from top and bottom each half of the given trim percentage of rows - calculate the average of what's left.

    Is probably doable, but will definetly require some steps...

    /Kenneth

     

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply