Very basic question - Insert or Update?

  • Well, personally, I feel that every suggestion always should be both 'reliable' and 'atomic' - what good would it be otherwise?

    Assuming that what we díscuss here is the single instance of 'update or inert', you do not need to enclose the process in an explicit transaction. In either case it's only one statement that will be executed, so it's atomic anyway. Should you have more than one table as part of the 'transaction', then it's time to do explicit trans. (but that's another story)

    > Next question: how is "update followed by a row check" better or worse than an "if exists followed by an update"?  In either case we will subsequently do an insert 50% of the time.

    Yes, and that's the point. We save that 'overhead' so to speak. It's 'better' to go for the update right away - 50% of the time, the work is then done. This is 'cheaper' than always do a) check, then b) insert or update.

    For the 'high concurrency' scenario, if you're worried that two simultaneous inserts might contend about the same new row, and you really want to try and squeeze every inch by avoiding 'unnecessary' queries - one should actually do the insert, while the other should do an update of that row, you could do it like this: (at the same time avoiding negative (ie not exists) searches)

    insert  myTab (id, val)

    select @id, @val

    from myTab t1

    left join myTab t2

    on t1.id = t2.id

    where t1.id = @id

    and t2.id is null

    select @err = @@error, @rc = @@rowcount

    if ( @err <> 0 ) goto errhandler

    if ( @rc = 0 ) -- row (id) already exists, to avoid PK violation, we need an update instead

      begin

       update myTab

       set val = @val

       where id = @id

      end

    Admittedly, the above is a bit harder to understand, but essentially what we're doing is that we go for the INSERT first - and to avoid an eventual PK violation, the insert is based on a left self join, testing for the existence of the PK. If it exists, the insert itself will return 0 rows, so we then know by looking at @@ROWCOUNT that if we should do an update or not. Shold we enter the described scenario that two simultaneous connections try to insert the same PK, (because both have by some method retrieved the same PK value, no matter how), only one at a time can actually attempt the insert operation - SQL Server itself will see to that. Now, I haven't tested this down to the milliseconds (considering rowlocks and all might actually allow to exact simultaneous inserts anyway), though that is probably unlikely.. You need that to happen, and also the situation of more than one actually trying to add a new row with the same new PK for it to break...

    The critical part here is what goes into the ON and WHERE clauses of the left join. If it's not right, it won't work as expected.

    And, ultimately - it all depends

    If the difference is measurable, or even worth the effort will of course depend. I just wanted to show another 'angle' to the 'problem'.

    /Kenneth

     

  • This doesn't include the Else...Insert statement, and this is in reference to summarizing about 50,000 rows every night, but here is my question:

    If I choose the IF EXISTS method (then update), am I not evaluating the WHERE clause in the Update statement anyway?  

    -- Create temp table to hold views data

     If Object_Id('tempdb..#tmpFactViewsData') is Not Null

     DROP TABLE [dbo].[#tmpFactViewsData]

     

     CREATE TABLE #tmpFactViewsData

     (

     RptYear smallint,

     RptMonth smallint,

     LanguageCode nvarchar(2),

     CountryCode nvarchar(2),

     ApplicationID int,

     ProductRangeID int,

     OSID int,

     ProductID int,

     UserGroupID int,

     RptCount int

    &nbsp

     INSERT INTO #tmpFactViewsData

     SELECT  Year(FCD.ViewDate) AS RptYear, Month(FCD.ViewDate) AS RptMonth, FCD.LanguageCode, FCD.CountryCode, FCD.ApplicationID, FCD.ProductRangeID, FCD.OSID, FCD.ProductID, UserGroupID, COUNT(*) AS RptCount

     FROM FactCommonDaily FCD

     WHERE FCD.Treated = 1

     GROUP BY Year(FCD.ViewDate), Month(FCD.ViewDate), FCD.LanguageCode, FCD.CountryCode, FCD.ApplicationID, FCD.ProductRangeID, FCD.OSID, FCD.ProductID, UserGroupID

     IF EXISTS(SELECT FVD.RptYear, FVD.RptMonth, FVD.LanguageCode, FVD.CountryCode, FVD.ApplicationID, FVD.ProductRangeID, FVD.OSID, FVD.ProductID, FVD.UserGroupID

       FROM FactViewsData FVD, #tmpFcatViewsData tFVD

       WHERE FVD.RptYear = tFVD.RptYear

        AND FVD.RptMonth = tFVD.RptMonth

        AND FVD.LanguageCode = tFVD.LanguageCode

        AND FVD.CountryCode = tFVD.CountryCode

        AND FVD.ApplicationID = tFVD.ApplicationID

        AND FVD.ProductRangeID = tFVD.ProductRangeID

        AND FVD.OSID = tFVD.OSID

        AND FVD.ProductID = tFVD.ProductID

        AND FVD.UserGroupID = tFVD.UserGroupID)

     -- If EXISTS then update table and add to the RptCount

     UPDATE FactViewsData FVD

     SET FVD.RptCount = FVD.RptCount+#tmpFactViewsData.RptCount

     WHERE FVD.RptYear = tFVD.RptYear

        AND FVD.RptMonth = tFVD.RptMonth

        AND FVD.LanguageCode = tFVD.LanguageCode

        AND FVD.CountryCode = tFVD.CountryCode

        AND FVD.ApplicationID = tFVD.ApplicationID

        AND FVD.ProductRangeID = tFVD.ProductRangeID

        AND FVD.OSID = tFVD.OSID

        AND FVD.ProductID = tFVD.ProductID

        AND FVD.UserGroupID = tFVD.UserGroupID

    I'm not seeing the value of the IF EXISTS statement....what am I missing????  Thanks.

  • Annoying that a closing parenthesis by itself turns into a winking smiley...

    RptCount int

     

  • No, it's quite unnecessary.

    If the update was attemped straight off, you could always check @@rowcount for how many rows was affected by the update. There is no error if rowcount is zero, so the EXISTS check doesn't add any value here.

    /Kenneth

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply