Stopping Duplicates with INSTEAD OF INSERT Problem

Question

Post reply

Stopping Duplicates with INSTEAD OF INSERT Problem

Viewing 8 posts - 31 through 37 (of 37 total)

You must be logged in to reply to this topic. Login to reply

Paul White SSC Guru Points: 150457 More actions · Answer 1

Daniel C (4/1/2010)

The problem should be easily solved with a MERGE statement inside your trigger. I had a similar task to solve and it's been taken care of very elegantly using that merge.

No chance for a duplicate to come in.

Two things on that:

1. You code example relies on running at the SERIALIZABLE isolation level.

2. In general, MERGE is not bullet-proof. See UPSERT race condition with MERGE by MVP Dan Guzman.

Running everything at SERIALIZABLE is a pretty poor choice for concurrency.

Not saying it has no merit - I am a big fan of MERGE - just be careful when making absolute statements.

Paul

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

JJ Duru Right there with Babe Points: 715 More actions · Answer 2

Two things on that:

1. You code example relies on running at the SERIALIZABLE isolation level.

2. In general, MERGE is not bullet-proof. See UPSERT race condition with MERGE by MVP Dan Guzman.

Running everything at SERIALIZABLE is a pretty poor choice for concurrency.

Regarding the concurrency while in SERIALIZABLE isolation level, you might wanna use this:



ALTER DATABASE AdventureWorks

    SET ALLOW_SNAPSHOT_ISOLATION ON;

I'm not a fan of row locking either. And from my perspective, this is a huge shortcoming from SQL Server's part; Oracle's coming with the so called allow_snapshot_isolation enabled by default.

Dan Guzman's proof of concept is using WITH(HOLDLOCK); I am using WITH(SERIALIZABLE) which according to MS these two are equivalent.

http://msdn.microsoft.com/en-us/library/ms187373.aspx

Add in SET ALLOW_SNAPSHOT_ISOLATION ON and suddenly you're improving the concurrency.

By the way, I constructed my trigger following Dan Guzman's article. Thanks for the link though.

Paul White SSC Guru Points: 150457 More actions · Answer 3

Daniel C (4/1/2010)

Regarding the concurrency while in SERIALIZABLE isolation level, you might wanna use this:

ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON

But then of course you have to explicitly set the SNAPSHOT isolation level and deal with any update conflicts.

READ_COMMITTED_SNAPSHOT is frequently an easier solution.

I'm not a fan of row locking either. And from my perspective, this is a huge shortcoming from SQL Server's part; Oracle's coming with the so called allow_snapshot_isolation enabled by default.

I am not sure what you are saying here. There's nothing wrong with 'row locking'. Each isolation level has different characteristics, and each is designed for a different set of requirements.

Dan Guzman's proof of concept is using WITH(HOLDLOCK); I am using WITH(SERIALIZABLE) which according to MS these two are equivalent.

They are synonyms. My point was that your MERGE example works because it uses exactly the same locking semantics I described earlier. There is essentially no difference between the two - except MERGE is only available from SQL Server 2008.

Add in SET ALLOW_SNAPSHOT_ISOLATION ON and suddenly you're improving the concurrency.

No such thing as a free lunch, and full snapshot is a long way short of being free.

See Choosing Row Versioning-based Isolation Levels and Using Row Versioning-based Isolation Levels for the long list of caveats and restrictions.

By the way, I constructed my trigger following Dan Guzman's article. Thanks for the link though.

Good. Can I ask why the explicit transaction though?

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

david_wendelken SSCrazy Points: 2388 More actions · Answer 4

I think the problem was caused by a two-step algorithm with interleaved processes.

Process stream A:

a1. Check for duplicate

a2. Insert new record.

Process stream B:

b1. Check for duplicate.

b2. Insert new record.

Now, let's interleave them:

a1. Check for duplicate. Passes.

b1. Check for duplicate. Passes because process a hasn't inserted YET.

a2. Insert new record.

b2. Insert new record. There's a duplicate.

I know this may sound silly, but why not put a unique key index on the GUID column?

The code is simple to understand and maintain by anyone who has basic sql knowledge and the duplicate is stopped.

Paul White SSC Guru Points: 150457 More actions · Answer 5

david_wendelken (4/2/2010)

I know this may sound silly, but why not put a unique key index on the GUID column?

Well, yes. But:

I'd prefer to use a unique constraint on the table, but we are partitioning the table and we partition on the created time field, and since the partition key id needs to be in every index (for our archiving purposes) it doesn't work since the time is different between the two records.

That information is way back on the first page (original post).

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Paul White SSC Guru Points: 150457 More actions · Answer 6

david_wendelken (4/2/2010)

a1. Check for duplicate. Passes.

b1. Check for duplicate. Passes because process a hasn't inserted YET.

a2. Insert new record.

b2. Insert new record. There's a duplicate.

Exactly. That's why a HOLDLOCK/SERIALIZABLE hint is required - to lock the range of values checked (even though no row exists to lock) to guarantee that the INSERT succeeds, and avoid the race condition.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

JJ Duru Right there with Babe Points: 715 More actions · Answer 7

Good. Can I ask why the explicit transaction though?

Because SQL Server will roll back only the unsuccessful statements bot not the ones that succeeded, in a multi-statements transaction. This is the default behaviour.

I need all them to run a "all or nothing" basis.

I realized that I posted a slightly old version of the trigger, which in fact should include SET XACT_ABORT ON at the top of the trigger body.

Paul White SSC Guru Points: 150457 More actions · Answer 8

Daniel C (4/2/2010)

Because SQL Server will roll back only the unsuccessful statements but not the ones that succeeded, in a multi-statements transaction. This is the default behaviour. I need all them to run a "all or nothing" basis.

Trigger code always runs inside a transaction - the implicit transaction associated with the statement that fired the trigger. Trigger code is therefore always atomic - either the triggering statement and the trigger code complete successfully as a unit, or the whole thing fails and is rolled back. See Rollbacks and Commits in Stored Procedures and Triggers.

If you check @@TRANCOUNT at the start of a trigger, you will see that it is always at least 1. It is therefore redundant to include an explicit transaction in a trigger, since it is always a nested transaction. I just wondered if you appreciated that or not.

Paul

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi