Quirky Update query...works on server ...fails on PC.

  • Quirky Update query...works on server ...fails on PC most of the time!!

    I have been testing "quirky update" to update running totals.

    I have run the code (see below) on my local PC... Vista 64-bit... 2GB RAM

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64)

    Enterprise Evaluation Edition (64-bit) on Windows NT 6.0 <X64> (Build 6002: Service Pack 2)

    and it works fine for approx 1M rows.

    ...however it fails 4 times out of 5 when I attempt 10M rows

    "Msg 8134, Level 16, State 1, Line 83

    Divide by zero error encountered.

    The statement has been terminated"

    This, I assume, is the quirky update safety check kicking in.

    Running the same code for 10M rows on a test server... Windows Server 2008 64-bit (SQL max server memory 12GB)

    Microsoft SQL Server 2008 R2 (SP1) - 10.50.2500.0 (X64)

    Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) (Hypervisor)

    it works with no problem... every time (approx 2 mins).

    Has anyone experienced this before.....??

    Code below... hopefully I have all the necessary hints and safety checks in place.

    --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    --PLEASE NOTE THAT THIS WILL DELETE ANY EXISTING TABLES WITH THE SAME NAME IN tempdb....please amend if required

    --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    USE [tempdb] --==== start in a safe place!!!

    GO

    --==== Conditionally delete tables from [tempdb]

    IF Object_id('tempdb..Products', 'U') IS NOT NULL

    DROP TABLE tempdb..Products;

    IF Object_id('tempdb..Locations', 'U') IS NOT NULL

    DROP TABLE tempdb..Locations;

    IF Object_id('tempdb..StockMovements', 'U') IS NOT NULL

    DROP TABLE tempdb..StockMovements;

    -- LOCATIONS

    CREATE TABLE [dbo].[Locations]( [LocID] [int] )

    GO

    INSERT [dbo].[Locations] ([LocID]) VALUES (1)

    INSERT [dbo].[Locations] ([LocID]) VALUES (2)

    INSERT [dbo].[Locations] ([LocID]) VALUES (3)

    INSERT [dbo].[Locations] ([LocID]) VALUES (4)

    INSERT [dbo].[Locations] ([LocID]) VALUES (5)

    INSERT [dbo].[Locations] ([LocID]) VALUES (6)

    INSERT [dbo].[Locations] ([LocID]) VALUES (7)

    INSERT [dbo].[Locations] ([LocID]) VALUES (8)

    INSERT [dbo].[Locations] ([LocID]) VALUES (9)

    --PRODUCTS

    SELECT TOP 10000

    ProductID = IDENTITY(INT, 1, 1)

    INTO Products

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    --STOCK MOVEMENTS

    --OUTWARD MOVEMENTS

    SELECT TOP 8000000 ---- NOTE 8 MILLION rows

    TranID = IDENTITY(INT, 1, 1),

    ProdID = 1 +CAST(Abs(Checksum(Newid()) % 9000 ) AS INT),

    LocID = 1 +CAST(Abs(Checksum(Newid()) % 9 ) AS INT),

    Qty = -10 + CAST(Rand(Checksum(Newid())) * 10 AS INT),

    StkTransDate = Dateadd(dd, Abs(Checksum(Newid()))%Datediff(dd, '2009', '2011'), '2009'),

    StkLocProdRunTotal = 0

    INTO StockMovements

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    --INWARD MOVEMENTS

    INSERT INTO StockMovements

    (ProdID,LocID,Qty,StkTransDate,StkLocProdRunTotal)

    SELECT TOP 2000000

    ProdID = 1 +CAST(Abs(Checksum(Newid()) % 9000 ) AS INT),

    LocID = 1 +CAST(Abs(Checksum(Newid()) % 9 ) AS INT),

    Qty = 20+ CAST(Rand(Checksum(Newid())) * 10 AS INT),

    StkTransDate = Dateadd(dd, Abs(Checksum(Newid()))%Datediff(dd, '2009', '2011'), '2009'),

    StkLocProdRunTotal = 0

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    --== populate with opening stock

    INSERT INTO StockMovements

    (ProdID,LocID,Qty,StkTransDate,StkLocProdRunTotal)

    SELECT ProdId = Products.ProductID,

    LocId = Locations.LocID,

    Qty = 40 + CAST(Rand(Checksum(Newid())) * 10 AS INT),

    StkTransDate = CONVERT(DATETIME, '2008-12-31 00:00:00', 102),

    StkLocProdRunTotal = 0

    FROM Products

    CROSS JOIN Locations

    CREATE CLUSTERED INDEX [IX_StkMove]

    ON [dbo].[StockMovements] ( [LocID] ASC, [ProdID] ASC, [StkTransDate] ASC, [TranID] ASC )

    ---QUIRKY UPDATE FOR RUNNING TOTAL

    DECLARE

    @LocID INT,

    @ProdID INT,

    @StkLocProdRunTotal INT,

    @RowCounter INT;

    SELECT @StkLocProdRunTotal = 0,

    @RowCounter = 1

    ;WITH

    cteBS AS

    (

    SELECT RowCounter = ROW_NUMBER() OVER(ORDER BY LocId, ProdId, StkTransDate, TranID ),

    LocId, ProdId, StkTransDate, Qty, StkLocProdRunTotal

    FROM StockMovements

    )

    UPDATE target

    SET @StkLocProdRunTotal = StkLocProdRunTotal

    = CASE

    WHEN RowCounter = @RowCounter --SafetyCheck

    THEN

    CASE

    WHEN LocID = @LocID

    AND ProdID = @ProdID

    THEN @StkLocProdRunTotal + Qty

    ELSE Qty

    END

    ELSE 1/0 --Force failure if SafetyCheck is out of sync

    END,

    @LocID = LocID,

    @ProdID = ProdID,

    @RowCounter = @RowCounter + 1

    FROM cteBS target WITH (TABLOCKX)

    OPTION (MAXDOP 1)

    ;
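
    If anyone wants to sanity-check the results, a spot check along these lines compares the quirky running total against a set-based total for a random sample of rows (a rough sketch only...the correlated subquery is slow, so keep the TOP small):

    SELECT TOP 100
           sm.TranID, sm.LocID, sm.ProdID, sm.StkTransDate, sm.Qty, sm.StkLocProdRunTotal,
           CheckTotal = (SELECT SUM(x.Qty)
                         FROM StockMovements x
                         WHERE x.LocID = sm.LocID
                           AND x.ProdID = sm.ProdID
                           AND ( x.StkTransDate < sm.StkTransDate
                                 OR (x.StkTransDate = sm.StkTransDate AND x.TranID <= sm.TranID) ))
    FROM StockMovements sm
    ORDER BY Newid();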

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • I couldn't get it to fail on two different boxes.

    Box 1 is a 9-year-old desktop w/ a single 32-bit CPU, 1GB RAM, running 2k5 Dev Edition and Windows XP.

    Box 2 is a 1-year-old laptop w/ a dual 64-bit CPU, 4GB RAM, running 2k8 (not R2) Dev Edition and Windows 7.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Jeff

    I am not surprised that you cannot replicate it....I have used quirky update on several DBs, all without errors...and the purpose behind this was to build a test rig for a fellow colleague who has just joined our team.....;-)

    I tested this code many times before I posted..just to be sure I could replicate it.

    I am wondering if this is a memory resource issue??...Task Manager on my "little" PC was running at about 95% when it threw the error....and as I write, it has happened again.

    I know very little (sod all) about how to go about delving deeper into what is happening and where it is going wrong......have you any suggestions please?



    I tested it on 2K5 and 2K8, with ARITHABORT on and off in both instances (thinking that might be the issue), and couldn't get it to throw the error;

    my machine's got 8GB though, decent for a developer machine, so I didn't suffer from any RAM starvation or anything like that.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. With your description of the problem, we can provide a tested, verifiable solution to your question! Asking the question the right way gets you a tested answer the fastest way possible!

  • 3 successful runs on home lappy, 2K8, Vista.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this.


    Understanding and using APPLY, (I) and (II) - Paul White

    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden

  • Absolutely irrelevant, of course, but I just had to say that I absolutely love your handle, Jonathan Livingston Seagull (SQL)! 😎

  • J Livingston SQL (1/2/2012)


    Hi Jeff

    I am not surprised that you cannot replicate it....I have used quirky update on several DBs, all without errors...and the purpose behind this was to build a test rig for a fellow colleague who has just joined our team.....;-)

    I tested this code many times before I posted..just to be sure I could replicate it.

    I am wondering if this is a memory resource issue??...Task Manager on my "little" PC was running at about 95% when it threw the error....and as I write, it has happened again.

    I know very little (sod all) about how to go about delving deeper into what is happening and where it is going wrong......have you any suggestions please?


    I've only got 1GB and I've only given half of that to SQL. Task manager was pegged when I ran it on my older PC.

    I don't have much to offer on this problem because it won't replicate for me (or, apparently, anyone else so far). I know I'm really stretching for an answer for you, but a scan disk (not sure that's what they still call it) might be in order. That, and find some way to make sure there's no bad memory. I've seen some very strange things happen with bad memory... and, no, it doesn't always just jump out at you.

    Really stupid question but have to ask... are you sure that you actually built the CI on the table and that there aren't any other indexes on it? Any triggers?

    --Jeff Moden



  • I can reproduce this (using the script exactly as written) on SQL Server 2008 R2 developer edition, build 10.50.2789 (SP1 + CU3, latest available) on Windows 7 Ultimate x64, with SQL Server max server memory set to 256MB:

    [font="Courier New"]Msg 8134, Level 16, State 1, Line 10

    Divide by zero error encountered.

    The statement has been terminated.

    The query plan is this:

    This has a number of interesting features, but the things that grab my attention are: the eager table spool (which will spill to physical tempdb since the server memory is set too low to allow the pages to stay in memory), the compute scalar (which defines the expressions used in the quirky update), the DMLRequestSort attribute on the Clustered Index Update (which should ensure rows are presented to the operator in clustered key order), the WithOrderedPrefetch attribute on the Clustered Index Update (related to read-ahead activity), and the predicate on the Clustered Index Update. The compute scalar and update predicate are particularly interesting:

    Compute Scalar:

    [Expr1005] = Scalar Operator([@StkLocProdRunTotal] =

    CASE

    WHEN [Expr1004]=CONVERT_IMPLICIT(bigint,[@RowCounter],0) THEN

    CASE

    WHEN [tempdb].[dbo].[StockMovements].[LocID]=[@LocID]

    AND [tempdb].[dbo].[StockMovements].[ProdID]=[@ProdID]

    THEN [@StkLocProdRunTotal]+[tempdb].[dbo].[StockMovements].[Qty]

    ELSE [tempdb].[dbo].[StockMovements].[Qty]

    END

    ELSE (1)/(0) END),

    [Expr1006] = Scalar Operator([@LocID] = [tempdb].[dbo].[StockMovements].[LocID]),

    [Expr1007] = Scalar Operator([@ProdID] = [tempdb].[dbo].[StockMovements].[ProdID]),

    [Expr1008] = Scalar Operator([@RowCounter] = [@RowCounter]+(1))

    The update predicate:

    Scalar Operator([@LocID] = [Expr1006],

    [@ProdID] = [Expr1007],

    [@RowCounter] = [Expr1008],

    [tempdb].[dbo].[StockMovements].[StkLocProdRunTotal] = RaiseIfNullUpdate([Expr1005]))

    Now, Compute Scalars do not generally perform the calculation at the point they appear in the plan; the expression defined by the compute scalar is generally evaluated later, when the first operator that needs it requests the expression result. Nevertheless, it does appear that the expressions (and sub-expression calls) ought to be performed in the correct order for quirky update purposes, so long as the Clustered Index Update predicate is guaranteed to evaluate its expression list in the order presented above. Crucially, the assignment of [Expr1008] to @RowCounter must occur before the result of [Expr1005] (which references both [Expr1004] and @RowCounter) is computed. Of course this sort of lack of guaranteed order of evaluation is exactly what the safety check is there for.

    OK, so that is all very interesting (and I mention it to make the point that things are more complex than they appear) but I'm pretty certain the key to the issue is that when the Eager Spool spills to disk, it no longer guarantees (if it ever did) to replay the rows in the order they were presented to the spool. This may technically be a bug, in that the query plan does assume sorted order is preserved (otherwise the DML Request Sort would add an explicit Sort). On the other hand, it may be expected behaviour - in which case the effect of rows not arriving in key order is that minimal logging is no longer possible (that's what DMLRequestSort is for) but otherwise things are fine.

    There is some evidence for my claims: the same instance configured with 2GB max server memory does not spill the spool (at the second attempt anyway - once SQL Server realises it has more memory now) and no error occurs in that case. Of course anyone running quirky update without the safety check won't get that error, just wrong results! Upgrade to SQL Server 2012 as soon as possible and use the new SUM OVER ORDER BY syntax 🙂
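
    For this particular example, the 2012 version would look something like the following (just a sketch of the new windowed aggregate syntax, not something I have run against this exact table):

    SELECT LocID, ProdID, StkTransDate, TranID, Qty,
           StkLocProdRunTotal = SUM(Qty) OVER (PARTITION BY LocID, ProdID
                                               ORDER BY StkTransDate, TranID
                                               ROWS UNBOUNDED PRECEDING)
    FROM StockMovements;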

  • Interesting conclusions. I wonder why it runs just fine on a box with only 500 MB? I'll load it back up tonight after work and take a look at the execution plan.

    --Jeff Moden



  • Jeff Moden (1/3/2012)


    Interesting conclusions. I wonder why it runs just fine on a box with only 500 MB?

    Which exact versions of SQL Server are you running, and are you running 32-bit SQL Server or 64-bit? You said earlier that box 1 is 32-bit, but only that box 2 had a 64-bit CPU, not that it is necessarily running 64-bit SQL Server (it is perfectly possible to run 32-bit SQL Server on a 64-bit OS).
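
    If it helps, something like this (a quick sketch using the standard metadata functions) reports the build and bitness in one go:

    SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,
           SERVERPROPERTY('Edition')        AS Edition,
           @@VERSION                        AS FullVersionString;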

    Anyway, I do have some more information, having tested on 64-bit developer editions of SQL Server 2005, 2008, and 2008 R2. The versions are 9.0.5266 (2005 SP4 CU3 = latest available), 10.0.5768 (2008 SP3 CU2 = latest available), and 10.50.2789 (SP1 CU3 = latest available). The query plans are all the same, including the WithOrderedPrefetch attribute on the Clustered Index Update.

    I mention the prefetch attribute specifically because further testing has revealed that read-ahead plays an important role. If SQL Server has a large amount of memory, the update completes successfully whether read-ahead occurs or not. This is an example from an R2 run with 4GB available to SQL Server, and a cold data cache at the start of the query:

    Table 'StockMovements'.

    Scan count 1, logical reads 30947111, physical reads 16, read-ahead reads 46520

    Table 'Worktable'.

    Scan count 1, logical reads 30596864, physical reads 0, read-ahead reads 0
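
    (In case anyone wants to repeat the measurement: the output format above is from SET STATISTICS IO, and a cold data cache can be forced roughly like this; a sketch, for test instances only.)

    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;   -- clears the data cache; never do this on production
    SET STATISTICS IO ON;
    -- ...run the quirky update script here...
    SET STATISTICS IO OFF;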

    However, on all tested versions of SQL Server, having a much smaller 'max server memory' setting (I used 256MB) results in an error every time - unless I disable read-ahead with DBCC TRACEON (652, -1). This suggests to me that something bad happens if read-ahead is invoked, but there is insufficient buffer pool space for it to get as far ahead as it would like. To be clear, the update always completes successfully (albeit slowly!) when read-ahead is disabled regardless of version or buffer pool size. I don't have 32-bit versions available to test I'm afraid.
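
    For anyone trying the same repro, the relevant settings can be applied like this (a sketch; remember to put max server memory back to its original value afterwards, and note that trace flag 652 disables read-ahead for the whole instance):

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'max server memory (MB)', 256;
    RECONFIGURE;

    DBCC TRACEON (652, -1);      -- disable read-ahead reads
    -- ...run the test script here...
    DBCC TRACEOFF (652, -1);     -- re-enable read-ahead reads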

  • Jeff Moden (1/2/2012)


    J Livingston SQL (1/2/2012)


    Hi Jeff

    I am not surprised that you cannot replicate it....I have used quirky update on several DBs, all without errors...and the purpose behind this was to build a test rig for a fellow colleague who has just joined our team.....;-)

    I tested this code many times before I posted..just to be sure I could replicate it.

    I am wondering if this is a memory resource issue??...Task Manager on my "little" PC was running at about 95% when it threw the error....and as I write, it has happened again.

    I know very little (sod all) about how to go about delving deeper into what is happening and where it is going wrong......have you any suggestions please?


    I've only got 1GB and I've only given half of that to SQL. Task manager was pegged when I ran it on my older PC.

    I don't have much to offer on this problem because it won't replicate for me (or, apparently, anyone else so far). I know I'm really stretching for an answer for you, but a scan disk (not sure that's what they still call it) might be in order. That, and find some way to make sure there's no bad memory. I've seen some very strange things happen with bad memory... and, no, it doesn't always just jump out at you.

    Really stupid question but have to ask... are you sure that you actually built the CI on the table and that there aren't any other indexes on it? Any triggers?

    Definitely seem to have got some sort of memory issue...particularly when other apps are also open....will have to investigate.

    The CI is the only index and all tests run as per provided script.
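
    For what it's worth, a sanity query along these lines confirms it (a rough sketch, assuming the test table is still sitting in tempdb):

    SELECT name, type_desc
    FROM   tempdb.sys.indexes
    WHERE  object_id = Object_id('tempdb.dbo.StockMovements');

    SELECT name
    FROM   tempdb.sys.triggers
    WHERE  parent_id = Object_id('tempdb.dbo.StockMovements');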

    Seems Paul has managed to replicate it and has posted some detailed thoughts....


  • CELKO (1/4/2012)


    For running stats, we now have the OVER( [RANGE | ROWS] ..) sub-clause in the 2012 product.

    I am aware of the OVER( [RANGE | ROWS] ..) sub-clause in 2012...unfortunately it is very unlikely that we will migrate our systems to this version for several years....and therefore I haven't even worried about testing it.

    I am obliged to produce several running stats calculations for various depts...all of whom want the results "now" :-D....and so far "quirky" has delivered.

    I believe that over the past couple of years I have (hopefully) kept abreast of all the additional testing and code requirements...particularly the safety check......and until just recently had never experienced a problem...(and believe me...we have tested/retested and tested again)

    So it transpires that even though I somehow managed to screw up the memory on my PC...which appears to have been the problem...see the analysis by Paul above...the quirky update didn't actually fail...it produced the desired result...an error :-)...precisely what I needed to know I had a problem.

    SO...until we have a budget that allows migration to 2012...I will continue with "quirky" for the tasks I use it for...feeling safe in the fact that, providing the 'safety check' is used, it will give me what I need........unless someone can provide a solution that is just as quick to run???

    Kind regards and many thanks for the comments


  • CELKO (1/4/2012)


    The "quirky" update flaw is based on the old Sybase contigous storage model. The problem is that it is unpredictable; the data comes back in whatever is on the disk. Page splits, insertion order, defrags, locks from other users, etc. change that PHYSICALLY arrangement.

    This is why good SQL progreammers want to get this dropped from T-SQL and use the ANSI/ISO Standard MERGE statement instead.

    For running stats, we now have the OVER( [RANGE | ROWS] ..) sub-clause in the 2012 product.

    Heh... damned good thing the quirky update doesn't actually use the PHYSICAL arrangement. 😉 So far as having a FROM clause in UPDATE goes, not all "good SQL progreammers [sic]" want to see it dropped. It's only unpredictable if you don't know what to expect.

    --Jeff Moden



  • J Livingston SQL (1/4/2012)


    SO...until we have a budget that allows migration to 2012...I will continue with "quirky" for the tasks I use it for...feeling safe in the fact that, providing the 'safety check' is used, it will give me what I need........unless someone can provide a solution that is just as quick to run???

    Just for my own curiosity, would you be able to give a brief summary of a typical requirement? Obviously I saw the example given in your code here, but I am curious to know whether you need to persist the running totals, what the broader use case description is...things like that. As I say, I'm just interested to know a bit more, I don't have a particular agenda.

    Has anyone else managed to reproduce this now given the extra information I provided? Jeff in particular? I'm particularly interested to hear from anyone running 32-bit SQL Server with the reduced max server memory settings I mentioned.
