Invalid IN costruct ignored

  • I had a table (tblStop) containing a list of serial numbers, which were a subset of records in another table (tblCard) and was asked to set a flag and date on records in tblCard where the serial number existed in tblStop. There were 150 records in tblStop, and 65,000 records in tblCard

    I created this update query:

    UPDATE tblCard

    SET Stopped = 1, StoppedDate = Getdate()

    WHERE tblCard.SerialNumber IN (Select SerialNumber from tblStop)

    When I ran the query, it reported 65,000 records updated!

    I then realised that the column in tblStop was called CardSerialNumber, not SerialNumber

    I would have expected an error to be generated, as it would if I just ran Select SerialNumber FROM tblStop, but no error, and every single record in the target table was updated! The first words out of my mouth were Oh 5h1t! Luckily I was able to set the records back to their previous status using a backup, but I was shocked when it happened.

    Can anyone explain why it happened?

  • This is a gotcha with correlated subqueries.

    The tblCard table is implicitly included in your outer select (tblStop). Because SQL Server can't find SerialNumber in tblStop, it has a look in the inner table and finds SerialNumber in tblCard.

    Change your query to this, and it will generate the error you were expecting

    UPDATE tblCard

    SET Stopped = 1, StoppedDate = Getdate()

    WHERE tblCard.SerialNumber IN (Select tblStop.SerialNumber from tblStop)

    Somewhere (can't find it at the moment), I've seen an article explaining this a lot better than I have.

  • I tried it on one of our databases and it did not work. I got an error. This was on a SQL 2008 server.

    The only thing I would think off, if this happened to me, would be that maybe I selected the first 2 lines and left out the WHERE clause.

  • A practice that comes in handy:

    When doing any sort of manual update in a production database, wrap it in a transaction, and don't commit till you know the results will be correct.

    If you can't afford to have an uncommitted transaction (because of locks), then roll it back when you're validating it.

    The Output clause (introduce in SQL 2005) can make it very, very easy to verify exactly what will be done by the command before committing it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • sometimes using the UPDATE FROM syntax can keep you from fatfingering the correlated subquery like that; i know i did that same error to an invoices table a long time ago, marking everything as "shipped"

    ouchy.

    UPDATE tblCard

    SET Stopped = 1, StoppedDate = Getdate()

    FROM tblStop

    WHERE tblCard.SerialNumber = tblStop.SerialNumber --this will error out and you'd change it to CardSerialNumber

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • This wasn't the article I was thinking of, but is Microsoft's explanation:-

    http://msdn.microsoft.com/en-us/library/ms178050(SQL.90).aspx

  • Hi

    A very common error (I learned, too). UPDATE is the most dangerous statement in SQL in my opinion (something I tell people whenever I conduct a class). It's a psychological reason. Words like DELETE, TRUNCATE or even DROP tell you "I can corrupt your data", UPDATE sounds not really dangerous but it can corrupt a complete table, too.

    I use a little trick to reduce the possibility of those errors. (BTW: I always try to avoid sub-queries if possible). Using the joined-UPDATE syntax makes it quiet simple to test statements.

    --UPDATE c SET Stopped = 1, StoppedDate = Getdate()

    SELECT *

    FROM tblCard c

    JOIN tblStop s ON c.SerialNumber = s.SerialNumber

    So I can develop without any dangerous. When everything looks fine, just comment the "SELECT *" and uncomment the "UPDATE".

    Greets

    Flo

  • As everyone else has already stated - this is a known situation. Sub-queries have access to all of the columns from the tables in the outer query where the sub-query is referenced. If this was not possible, we would not have the ability to perform correlated sub-queries.

    Two practices that will help you avoid these kinds of issues in the first place are:

    1) When performing ad-hoc updates or deletes, wrap the code in a transaction.

    2) Alias every single table being referenced - always...

    For the first item - example would be:

    BEGIN TRANSACTION;

    -- perform update or delete here

    ROLLBACK TRANSACTION; -- after testing to make sure everything works

    -- COMMIT TRANSACTION; -- un-comment this line and comment out previous line

    For the second item - example would be:

    UPDATE MyTable

    SET somecolumn = 'some value'

    WHERE othercolumn IN (SELECT othercolumn FROM othertable ot);

    Now, just because the othertable is aliased now - this will not cause the query to fail. The update would still succeed because the othercolumn can still be referenced from MyTable. Modify it to:

    UPDATE MyTable

    SET somecolumn = 'some value'

    WHERE othercolumn IN (SELECT ot.othercolumn FROM othertable ot);

    And this one will immediately fail because othertable does not have othercolumn.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Many thanks to all concerned for your input

  • While we're all chipping in with tips and tricks...

    If you only expect to update one row, say so: UPDATE TOP (1)

    Yes, wrapping things in a transaction and rolling back as a test is good...but identity columns can trip you up.

    I run with IMPLICIT_TRANSACTIONS ON, in case I forget. No lectures please: it works for me.

    I also prefer Flo's approach of SELECTing the (TOP n) records which will be affected before adding the UPDATE/DELETE/INSERT.

    I would like to see the syntax that allows the original error deprecated, in favour of requiring aliases, just like with table variables.

    Paul

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply