Restarting replication after non-fatal errors

  • We are running a very small but crucial DB locally that replicates changes in our stock control and accounting system to our remotely hosted web server. We have a 512k leased line and a rather messy 'evolved' rather than installed network. Nearly every week one or other of the transactional replications fails due to a network error (I am presumming it could not connect to the subscriber but could find no extra detail in the logs). If I resyncronise the push subscription everything starts up fine again. I am looking for a way to make the replication auto restart if it fails.

    Anyone have any ideas how to either solve these general network errors or to get the replication to auto restart.

    Thank you.

    Rolf

  • It normally restarts on it's own depending on the type of replication. If you will open EM and drill Management/Sql Server Agent/ Jobs you will find the actuall jobs that control this process. They are of catergory REPL-... and are usually named SERVERNAME-DBNAME-[SUBSCRIBER-]NUMBER. In these you will find the job steps which one is normally Run Agent. Look at the Advanced take of this step and you shoul see the retries and the retry interval. Adjust if they are being reached and the process completely ends.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Thanks for the prompt reply.

    The retry is set to 10 tries 1 minute apart. If I look at the subscription history it tells me it only tried 3 times..?

    Further investigation of the history shows me that there is a Transactional Process Deadlock occuring and then the general network errors occur. What is a deadlock and does it usually cause a replication to fail..?

    Thanks

  • A deadlock is when two or more items try to access resources the other needs to complete.

    For instance if a process is access table 1 and needs table 2 which is locked by another process which in turn is trying to access table 1 then you are in a deadlock. SQL then makes a decision after a period of time who gets be be the victim and kills their process. Deadlocks can cause any type of query mechanism to fail. You may want to trun on tracing or use profiler if this occurrs often and monitor what transactions are taking place. This may give you an answer as to what needs to be fixed.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • It may well be that there are 2 push subscriptions replicating the same tables to 2 different servers.

    I have set up 2 publications to the 2 different servers then a push subscription for each. Is there a better way to avoid these deadlocks..?

    Thanks for all the help very elucidating.

  • If the publications are the same in setup then you should be able to create one publication and send to both subscribers. If not then if this is not transactional replication try to setup so times do not overlap. If transactional then then you may not have another solution if you cannot do the first other than monitor.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Probably worth looking into the deadlocks more. Shouldnt be replication, regardless of number of publications you're only running one log reader per db. It is common to have deadlocks in the distribution db if you have a large number of distribution agents running with a fairly high volume.

    You can use an alert to restart the job. Just because either log reader or distrib agent fails doesnt mean you have to snapshot - as long as you restart before the subscription expires (normally 72 hours) you're good.

    Andy

    http://qa.sqlservercentral.com/columnists/awarren/

  • They are both transactional replications, same tables (in fact identical) just to different servers.

    I have never set up an alert, what does it entail..? Looking in BOL now =).

    I could change one to a snapshot replciation at the start of each day but would prefer to have both as transactional.

    Thanks

    Rolf

  • I would try using one publication for both servers as it will direct the process better. Andy may be right thou, may not be the replication at all but if you can be sure then that is one less prossibility.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Just checked again and there is actually one article that is only replicated in one of the push subscriptions so I will need 2 publications.

    Checking out alerts. Which counter would I use..? The locks counter..?

    Rolf

  • Set up an alert

    Type = SQL Server event alert

    Error = 14151

    Database = YourDbThsiHappensOn

    On the response tab check Execute job, and pick the job related to the replication.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • I apologise for my ignorance but I cannot find that type of alert.

    When I create a new alert I am prompted to select a performance object and then a counter. Neither of which have an object called the Server Event Alert.

    Any more info appreciated.

    Thanks

    Rolf

  • Open EM and drill down thru Management/SQL Server Agent/Alerts should be there. If this is what you are doing then what version of SQL are you using?

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply