July 11, 2012 at 3:54 pm
Hi,
I'm running a SQL 2005 publication that is partitioned and merge replicated to 354 separate subscribers. Each partition is a unique subset of the master data.
Recently I have started having very slow replication performance. I am seeing a lot of failures, with errors such as:
"The merge process was unable to create a new generation at the 'Publisher'. "
"The merge process was unable to change generation history at the 'Publisher'."
"The merge process was unable to access row metadata at the 'Subscriber'.
When I look at the longest running transactions on the server, I'm seing huge lock numbers for some transactions. At the minute I have a single subscriber holding 1.3m locks across a range of tables (OBJECT/PAGE with Intent Exclusive, KEY with Exclusive). It is holding an open transaction that has been running for 6 hours. Usual replication times are from 10 minutes to 2 hours. A profiler trace on the deadlock graph and lock acquired/cancel/released/deadlock/deadlock chain events shows nothing untoward, just the usual chatter of acquired/released.
Checking the subscriber via Replication Monitor shows that it is hanging on "Upload 99% complete". This is a common feature for the long running replications. They seem to get almost there, then hang for ages, then error with one of the messages above.
From what I've read around the web, I'm thinking this might be a blocking issue? Specifically contention on the replication system tables. Perhaps related to a mismatch between msmerge_genhistory and msmerge_contents? See threads below as examples:
All merge agents are on a slow link profile
Agent query timeout is at 65534
generation_levelling_threshold is set to 0 at both subscribers and publisher
Publisher config from sysmergepublications:
publisher XXX
publisher_db XXX
name XXX
description XXX
retention 18
publication_type 1
pubid 8E72F963-2C85-4005-BC5F-8494A1BE23BE
designmasterid 8E72F963-2C85-4005-BC5F-8494A1BE23BE
parentid 8E72F963-2C85-4005-BC5F-8494A1BE23BE
sync_mode 0
allow_push 1
allow_pull 1
allow_anonymous 0
centralized_conflicts 1
status 1
snapshot_ready 1
enabled_for_internet 0
dynamic_filters 1
snapshot_in_defaultfolder 0
alt_snapshot_folder XXX
pre_snapshot_script XXX
post_snapshot_script XXX
compress_snapshot 0
ftp_address NULL
ftp_port XXX
ftp_subdirectory NULL
ftp_login ANONYMOUS
ftp_password NULL
conflict_retention 18
keep_before_values 0
allow_subscription_copy 0
allow_synctoalternate 0
validate_subscriber_info HOST_NAME()
ad_guidname NULL
backward_comp_level 90
max_concurrent_merge 0
max_concurrent_dynamic_snapshots 0
use_partition_groups 1
dynamic_filters_function_list HOST_NAME()
partition_id_eval_proc MSmerge_evalpartid_sp_8E72F9632C854005
publication_number 1
replicate_ddl 1
allow_subscriber_initiated_snapshot 1
distributor XXX
snapshot_jobid 0x31B5B507C7084542BC1BF618D874FDAF
allow_web_synchronization 0
web_synchronization_url NULL
allow_partition_realignment 1
retention_period_unit 0
decentralized_conflicts 1
generation_leveling_threshold 0
automatic_reinitialization_policy 0
Subscriber @@version: Microsoft SQL Server 2005 - 9.00.4053.00 (Intel X86) May 26 2009 14:24:20 Copyright (c) 1988-2005 Microsoft Corporation Express Edition with Advanced Services on Windows NT 5.1 (Build 2600: Service Pack 2)
Publisher @@version: Microsoft SQL Server 2005 - 9.00.4262.00 (X64) Aug 13 2009 17:06:39 Copyright (c) 1988-2005 Microsoft Corporation Standard Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 2)
I'm really stuck here, any advice on a strategy for investigating these problems would be hugely appreciated. You may guess that I'm not (usually) a DBA.
Thanks, Iain
July 16, 2012 at 2:02 pm
Not sure if this will help but found these on SP3 fixes today - not sure on versions of your subscriber.
http://support.microsoft.com/kb/955706
959023 (http://support.microsoft.com/kb/959023/ ) FIX: A blocking issue occurs when you run the Merge Agent in SQL Server 2005
959024 (http://support.microsoft.com/kb/959024/ ) FIX: When the Merge Agent synchronizes multiple batches of changes, the synchronization may take a long time to finish or the synchronization may expire unexpectedly in SQL Server 2005
Version info:
SQL Server 2005 Service Pack 4 9.00.5000.00
SQL Server 2005 Service Pack 3 9.00.4035
SQL Server 2005 Service Pack 2 9.00.3042
SQL Server 2005 Service Pack 1 9.00.2047
SQL Server 2005 RTM 9.00.1399
July 16, 2012 at 2:10 pm
That looks very interesting Jamie, thanks.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply