Implicit conversion comparing INT column to 0

In a stored procedure, the following code is causing a huge read and CPU load when it really shouldn't. @IDParameter comes into the proc as a parameter.

    Here's the snippet of code where the problem is coming in:

DECLARE @ID INT;
SET @ID = (SELECT ID FROM OtherTable WHERE FKID = @IDParameter);

SELECT COUNT(*)
FROM LargeTable
WHERE MostlyZeroID = @ID AND MostlyZeroID > 0

Most (90+%) of the MostlyZeroID rows are 0 (hence the name), but regardless of distribution this should evaluate to 0 with minimal work on SQL Server's part. However, when this runs, it uses a ton of CPU and does a ton of reads as it seeks through the entire index. When I look at the execution plan, I see under the seek predicate a Scalar Operator(CONVERT_IMPLICIT(int,[@1],0)), which is what is destroying the performance.
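In case it helps anyone reproduce this, here's roughly how to pull the plan and see that predicate (just a sketch; the hard-coded @ID value is a stand-in for what the proc actually looks up):

DECLARE @ID INT;
SET @ID = 12345; -- stand-in value; the proc looks this up from OtherTable

-- Return the actual execution plan XML along with the results so the
-- Seek Predicate (and the CONVERT_IMPLICIT) is visible.
SET STATISTICS XML ON;

SELECT COUNT(*)
FROM LargeTable
WHERE MostlyZeroID = @ID AND MostlyZeroID > 0;

SET STATISTICS XML OFF;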

I've confirmed that the MostlyZeroID column in LargeTable is defined as INT NOT NULL. I also tested the scenario outside the stored procedure, without any variables, to make sure it wasn't some kind of strange parameter-sniffing scenario:

SELECT COUNT(*)
FROM LargeTable
WHERE MostlyZeroID = 0 AND MostlyZeroID > 0

    However, this query also did the implicit conversion.

I then tried this out on a temp table populated with a similar number of records (100 million) and a similar distribution, and I didn't get the implicit conversion (I got a constant scan, as I would've expected) when I did this:

SELECT COUNT(*)
FROM #TestTable
WHERE MostlyZero = 0 AND MostlyZero > 0

I also tried the same on several other tables that are set up similarly (a large number of zeros in an INT column), and I always got a constant scan with no implicit conversion.

Can anyone explain why the query engine is interpreting this 0 as something other than an INT and doing an implicit conversion when getting the count in the scenario above? And what can be done to protect against it? In our case, an IF @ID > 0 check was placed before the code containing the count, since there was no reason to even run it if @ID was zero; a sketch of that guard is below.
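Here's roughly what that guard looks like (a sketch of the approach, using the same names as the snippet above, including the proc's @IDParameter):

DECLARE @ID INT;
SET @ID = (SELECT ID FROM OtherTable WHERE FKID = @IDParameter);

-- Skip the expensive count entirely when @ID is 0 or NULL, since
-- MostlyZeroID = @ID AND MostlyZeroID > 0 can't match anything then.
IF @ID > 0
BEGIN
    SELECT COUNT(*)
    FROM LargeTable
    WHERE MostlyZeroID = @ID AND MostlyZeroID > 0;
END;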

An implicit conversion of a constant or variable to a different data type shouldn't cause performance issues, since that's a one-time thing, and it's done to prevent the much worse situation of having to implicitly convert the table column, which would cause a big performance issue; the contrast is sketched below.
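To illustrate the difference (a rough sketch against the table above; the varchar comparison is contrived purely to show the column-side case):

-- Variable-side conversion: the smallint variable is converted to int
-- once, and the index on MostlyZeroID can still be used for a seek.
DECLARE @v smallint;
SET @v = 0;

SELECT COUNT(*)
FROM LargeTable
WHERE MostlyZeroID = @v;

-- Column-side conversion: the CONVERT runs against every row and
-- defeats a straightforward index seek.
SELECT COUNT(*)
FROM LargeTable
WHERE CONVERT(varchar(12), MostlyZeroID) = '0';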

SQL DBA, SQL Server MVP (07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

I think the problem is that, because of the conversion, it has to go through the whole index to find that there is in fact no value that is both = 0 and > 0. When it does the implicit conversion, the query on my table does about 1,000,000 reads and uses ~9000 CPU when I run a trace. Without the implicit conversion, it's 3 reads and 0 CPU.

Do you have any idea why it's doing the implicit conversion? I thought that, by default, a whole number used in a query was treated as an INT. It seems to be in most other scenarios I can find and test; a quick check of the literal's type is below.
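For what it's worth, the literal does come out as an INT when checked directly (a quick sanity check using SQL_VARIANT_PROPERTY):

-- Both report 'int': a bare whole-number literal that fits in int is
-- typed as int, so the literal by itself isn't the culprit.
SELECT SQL_VARIANT_PROPERTY(0, 'BaseType') AS literal_type;
SELECT SQL_VARIANT_PROPERTY(612452, 'BaseType') AS literal_type;

Note that the CONVERT_IMPLICIT in the plan above is applied to the auto-generated parameter [@1], not to the literal itself.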

Would need to see the actual query plan to see why that implicit conversion is so significant. There must be something going on with a table column, not just a variable.

SQL DBA, SQL Server MVP (07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Thanks for taking the time to reply, Scott. Attached is the query plan from my smaller test copy of the database (there are only a couple of million rows in the table).

    Query:

SELECT COUNT(*)
FROM [dbo].[TestTable]
WHERE [MostlyZero] > 0 AND [MostlyZero] = 0

    Set Stats IO and Time ON output:

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

(1 row(s) affected)
Table 'TestTable'. Scan count 1, logical reads 55288, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:
   CPU time = 655 ms, elapsed time = 660 ms.

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.

  • Thanks for the follow up.

Interesting, and bizarre. We can prevent the implicit conversion using CAST, but we still have to recompile to avoid a full scan even with *literal* values (look at the plan with and without the RECOMPILE option). How lame for the optimizer!

SELECT COUNT(*)
FROM [dbo].[TestTable]
WHERE [MostlyZero] > CAST(0 AS int) AND [MostlyZero] = CAST(0 AS int)
OPTION (RECOMPILE)

SQL DBA, SQL Server MVP (07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Strange. It actually turns out that just putting OPTION (RECOMPILE) at the bottom of the query, without the casts, also works. I've tried this on other tables (both regular tables and temp tables) and I didn't have this problem on any of them. It seems the problem stems from how the optimizer is interpreting the 0 just for this table, but I don't see anything different about it compared to the others I've tried it on. When I look at the plan without OPTION (RECOMPILE), Query #1 at the top of the plan is:

    SELECT COUNT(*) FROM [dbo].[TestTable] WHERE [MostlyZero] > @1 AND [MostlyZero]= @2

    Whereas with the OPTION(Recompile) Query #1 at the top of the plan is:

    SELECT COUNT(*) FROM [dbo].[TestTable] WHERE [MostlyZero] > 0 AND [MostlyZero]= 0
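The @1/@2 form is what simple (auto) parameterization produces when SQL Server swaps literals for parameters, so one thing worth checking is the database's parameterization setting and which form actually got cached (a sketch using the standard DMVs):

-- FORCED parameterization replaces literals in far more queries than
-- the default SIMPLE setting does.
SELECT name, is_parameterization_forced
FROM sys.databases
WHERE name = DB_NAME();

-- The cached text shows which form was stored: a parameterized entry
-- has @1/@2 in place of the original literals.
SELECT cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%MostlyZero%';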

    Here's the example temp table I used when trying to recreate this:

CREATE TABLE #TestTable
(
    ID int NOT NULL IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
    Name varchar(50) NOT NULL,
    Age int NOT NULL,
    MostlyZero int NOT NULL
);

INSERT #TestTable (Name, Age, MostlyZero)
SELECT 'John', s1.number % 10 + 25, 0
FROM master.dbo.spt_values s1
CROSS JOIN master.dbo.spt_values s2
CROSS JOIN master.dbo.spt_values s3
WHERE s1.type = 'P' AND s2.type = 'P' AND s3.type = 'P'
    AND s2.number < 20;

CREATE INDEX IX_#TestTable_MostlyZero ON #TestTable (MostlyZero);

UPDATE #TestTable
SET MostlyZero = 612452 + ID
WHERE ID % 300000 = 2;

SELECT COUNT(*)
FROM #TestTable
WHERE MostlyZero > 0 AND MostlyZero = 0;

    Both the query as it is above and the query with the OPTION(Recompile) show this as Query #1:

    SELECT COUNT(*) FROM #TestTable WHERE MostlyZero > 0 AND MostlyZero = 0

  • No conversion is taking place when I test your code... Perhaps different collation settings???

    (I tested with SQL_Latin1_General_CP1_CI_AS)
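If it helps, the collation at each level can be compared with the standard catalog views (a sketch; the table name is from your repro):

-- Database-level collation.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_collation;

-- Column-level collation: collation_name is NULL for non-character
-- columns such as int, so it shouldn't affect MostlyZero directly.
SELECT c.name, c.collation_name
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID('dbo.TestTable');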

  • Looks like I'm using the same collation as you. I checked and the database, table and column are all using SQL_Latin1_General_CP1_CI_AS collation.

    Like I said though, there are several other tables that are structurally the same (as far as I can see) that don't suffer from the same problem when I query them.

  • I was just using CAST to show that it would avoid the implicit conversion, but it still didn't resolve the need to recompile. This is just a failure on the part of the optimizer if you ask me.

SQL DBA, SQL Server MVP (07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Looks like you have a real head scratcher on your hands... Nothing else comes to mind at the moment...
