How to find average when some of the fields contain null value?

  • Dear All,

    My query is related to handling null values in calculating the average at row level. Suppose we have 4 columns.

    COL1 COL2 COL3 COL4

    11 12 11 NULL

    24 22 NULL 33

    44 NULL NULL 44

    55 NULL NULL 55

    My query is suppose we need to get average of all the 4 columns i.e. (COL1+COL2+COL3+COL4)/4. As some of column contains null value so we cannot use above method for finding the average i.e. col1+col2.... Also we cannot replace NULL with 0 using ISNULL() as it reduces the average value, in short we need to take only those columns in count which has value and need to hide null values i.e. in first row avg should be of (col1+col2+col3)/3.

    Please let me know if anyone has faced this scenario in past and what solution he/she opted?

  • abhisheksrivastava85 (5/16/2011)


    Dear All,

    My query is related to handling null values in calculating the average at row level. Suppose we have 4 columns.

    COL1 COL2 COL3 COL4

    11 12 11 NULL

    24 22 NULL 33

    44 NULL NULL 44

    55 NULL NULL 55

    My query is suppose we need to get average of all the 4 columns i.e. (COL1+COL2+COL3+COL4)/4. As some of column contains null value so we cannot use above method for finding the average i.e. col1+col2.... Also we cannot replace NULL with 0 using ISNULL() as it reduces the average value, in short we need to take only those columns in count which has value and need to hide null values i.e. in first row avg should be of (col1+col2+col3)/3.

    Please let me know if anyone has faced this scenario in past and what solution he/she opted?

    Something like this might work - sorry I've not got access to SQL server to check the syntax right now...

    (ISNULL(COL1,0)+ISNULL(COL2,0)+ISNULL(COL3,0)+ISNULL(COL4,0))

    /

    (ISNULL(SIGN(COL1),0)+ISNULL(SIGN(COL2),0)+ISNULL(SIGN(COL3),0)+ISNULL(SIGN(COL4),0))

    EDIT: You need to handle when all columns are null - as well! not handled in sample above.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Probably a much more elegant solution that this, but here you go: -

    --==Sample Data==--

    DECLARE @TABLE AS TABLE (pkey INT IDENTITY, col1 INT, col2 INT, col3 INT, col4 INT)

    INSERT INTO @TABLE (col1, col2, col3, col4)

    SELECT 11, 12, 11, NULL

    UNION ALL SELECT 24, 22, NULL, 33

    UNION ALL SELECT 44, NULL, NULL, 44

    UNION ALL SELECT 55, NULL, NULL, 55

    --==Not Pretty - Solution==--

    SELECT e.pkey,

    (MAX(ISNULL(e.col1,0))+MAX(ISNULL(e.col2,0))+MAX(ISNULL(e.col3,0))+MAX(ISNULL(e.col4,0))) /

    (SUM(ISNULL(a.col1,0))+SUM(ISNULL(b.col2,0))+SUM(ISNULL(c.col3,0))+SUM(ISNULL(d.col4,0)))

    FROM (SELECT pkey, 1 AS col1

    FROM @TABLE

    WHERE col1 IS NOT NULL) a

    LEFT OUTER JOIN (SELECT pkey, 1 AS col2

    FROM @TABLE

    WHERE col2 IS NOT NULL) b ON a.pkey = b.pkey

    LEFT OUTER JOIN (SELECT pkey, 1 AS col3

    FROM @TABLE

    WHERE col3 IS NOT NULL) c ON a.pkey = c.pkey

    LEFT OUTER JOIN (SELECT pkey, 1 AS col4

    FROM @TABLE

    WHERE col4 IS NOT NULL) d ON a.pkey = d.pkey

    LEFT OUTER JOIN @TABLE e ON e.pkey = a.pkey

    GROUP BY e.pkey


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Two ways i could think of:

    METHOD 1

    DECLARE @Table Table

    (

    COL1 INT,

    COL2 INT,

    COL3 INT,

    COL4 INT

    )

    INSERT INTO @Table VALUES

    (11, 12, 11, NULL )

    ,(24, 22, NULL, 33 )

    ,(44, NULL, NULL, 44 )

    ,(55, NULL, NULL, 55 )

    --=== Method 1 ( USING ISNULL AND CASE )

    SELECT ( ISNULL(COL1 ,0 ) + ISNULL(COL2 ,0 ) + ISNULL(COL3 ,0 ) + ISNULL(COL4 ,0 ) )

    / (( CASE WHEN COL1 IS NOT NULL THEN 1 ELSE 0 END ) +

    ( CASE WHEN COL2 IS NOT NULL THEN 1 ELSE 0 END ) +

    ( CASE WHEN COL3 IS NOT NULL THEN 1 ELSE 0 END ) +

    ( CASE WHEN COL4 IS NOT NULL THEN 1 ELSE 0 END )

    ) AS AVG_COL

    FROM @Table

    METHOD 2

    --==== Method 2 ( USING UNPIVOT )

    SELECT AVG ( Vals ) AVG_COL

    FROM (SELECT Row_Num = ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) ,

    COL1 , COL2 , COL3 , COL4

    FROM @Table ) UNPIVOT_SOURCE

    UNPIVOT

    ( Vals FOR Cols IN ( COL1 , COL2 , COL3 , COL4 )) UNPIVOT_HANDLER

    GROUP BY

    Row_Num

  • Thanks a lot for reply man, but it is returning average as 0 when i am using Sign function.

  • I think the method with the SIGN functions should read :

    (SIGN(ISNULL(Col1,0))+SIGN(ISNULL(Col2,0))+SIGN(ISNULL(Col3,0))+SIGN(ISNULL(Col4,0)))

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • No its still giving 0, i mean when i changed the syntax and made it sign(isnull(col1,0)), it is still not calculating it 🙁

  • Have u tried the other solutions posted?

  • abhisheksrivastava85 (5/16/2011)


    Thanks a lot for reply man, but it is returning average as 0 when i am using Sign function.

    Check your data and then your code. The method works just fine.

    CREATE TABLE #Table (COL1 INT, COL2 INT, COL3 INT, COL4 INT)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (1, 1, 1, 1)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (1, 1, 1, NULL)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (1, NULL, 1, NULL)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (NULL, NULL, NULL, NULL)

    SELECT (ISNULL(SIGN(COL1),0)

    +ISNULL(SIGN(COL2),0)

    +ISNULL(SIGN(COL3),0)

    +ISNULL(SIGN(COL4),0))

    FROM #Table

    Try casting the divisor as a decimal with the desired degree of precision. If both divisor and dividend are integer and the quotient is less than 1, the result will be integer 0.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • I can validate that based on the sameple data that you provided in the original post Chris's method works as intended.

    The averages, for each row is 11.33,26.33,44.00,55.00.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • The solutions posted using sign will not work correctly if the value is negative.

    Also, they need to account for the possibility that all four columns are null.

    create table #table

    (col1 numeric(10,4), col2 numeric(10,4), col3 numeric(10,4), col4 numeric(10,4))

    go

    insert into #table (col1, col2, col3, col4) values (3, 7, -2, 4)

    insert into #table (col1, col2, col3, col4) values (4, -4, 7, null)

    insert into #table (col1, col2, col3, col4) values (-4, 4, -7, null)

    insert into #table (col1, col2, col3, col4) values (1, null, 1, null)

    insert into #table (col1, col2, col3, col4) values (-1, null, 2, null)

    insert into #table (col1, col2, col3, col4) values (null, null, null, null)

    go

    select AverageCol1234 =

    convert(numeric(10,4),round(

    (isnull(col1,0.0000)+isnull(col2,0.0000)+

    isnull(col3,0.0000)+isnull(col4,0.0000))

    /

    nullif( case when col1 is null then 0.0000 else 1.0000 end +

    case when col2 is null then 0.0000 else 1.0000 end +

    case when col3 is null then 0.0000 else 1.0000 end +

    case when col4 is null then 0.0000 else 1.0000 end

    , 0.0000),4))

    from

    #table

    drop table #table

    Results:

    AverageCol1234

    --------------

    3.0000

    2.3333

    -2.3333

    1.0000

    .5000

    NULL

  • I finally had time to work on this....and then once I finished I find Michael had posted a more complete solution...aaaggh!

    Never mind, I will post mine because I have done it now!

    --= CREATE SOME TEST DATA

    IF OBJECT_ID('tempdb..#Table') IS NOT NULL

    DROP TABLE #Table;

    CREATE TABLE #Table (COL1 INT, COL2 INT, COL3 INT, COL4 INT)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (11, 12, 11, NULL)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (24, 22, NULL, 33)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (44, NULL, NULL, 44)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (55, NULL, NULL, 55)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (0, 0, 0, 0)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (NULL, NULL, NULL, NULL)

    INSERT INTO #Table (COL1, COL2, COL3, COL4) VALUES (0, 0, 0, NULL)

    -- The idea here is to count the NON-NULL columns by

    -- converting NULLS to 0 and anything else to 1s.

    -- This is achieved by first using ABS to ensure that

    -- all values are positive, the using SIGN to convert

    -- non-zero values to 1, leaving only ones, zeros and nulls.

    -- The nulls are converted to 0 using ISNULL, then all

    -- the resulting ones and zeros are added to get a column

    -- count.

    -- IF the column count is zero, the result is converted to a NULL

    -- using NULLIF. The resulting value is the divisor.

    --

    -- Next any NULL values in the columns are converted to zero

    -- and the results are added to get a row total.

    -- Now we can divide the row total by the column count (the divisor).

    -- If the result of this calculation is NULL - this means we

    -- either had a row total of zero or ALL NULLS, so we use COALESCE

    -- to test for any column to have a NON-NULL value -

    -- which will be a zero - if we find one, then zero is the result

    -- anything else is undefined - i.e. NULL

    SELECT

    ISNULL(

    CONVERT(

    DECIMAL(10, 2),

    1.0 *

    (

    ISNULL( COL1, 0 ) + ISNULL( COL2, 0 ) + ISNULL( COL3, 0 ) + ISNULL( COL4, 0 )

    )

    /

    NULLIF(

    ISNULL(SIGN(ABS(COL1)),0) + ISNULL(SIGN(ABS(COL2)),0) + ISNULL(SIGN(ABS(COL3)),0) + ISNULL(SIGN(ABS(COL4)),0)

    ,0 )

    ) ,

    COALESCE(COL1,COL2,COL3,COL4)

    )

    FROM #Table

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Edit: duplicate post - site returned an error when I posted the first time, so I tried again!

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Thanks a lot UDP Broadcaster 🙂

    This solutions is working fine and i am getting results.

    Thank you all guys for your valuable inputs.

  • But do you think this solution is making our query more expensive because if have more number of columns then using case and isnull in denominator is challenge. And also if db has bulk of data then also it will slower the query.Why i am asking this is because i am going to use this query in stored procedure and that stored procedure i am linking with ssrs report so i think it will slower the execution of report. So guys please help me on this.

    Thanks

    Abhishek

  • Viewing 15 posts - 1 through 15 (of 17 total)

    You must be logged in to reply to this topic. Login to reply