Transactional Replication - Distributor won't pick up a snapshot?

  • Hello!  I'm currently having an issue with transactional replication on some of our servers, and it's got me a bit puzzled.

    I need to reinitialize the publication, as there's been some changes on the subscriber that need to be wiped out and re-synchronized with the publisher.  To do so, I've marked the publication for reinitialization, run the snapshot agent, then the log agent, then the distributor.

    However, the distributor won't pick up the snapshot from the snapshot agent when it runs again; it instead tries to send over an erroneous command (updating a column that doesn't exist on the subscriber).  I've repeated the process in a few different variation, but it always fails this same way.

    Digging through Google for some similar cases hasn't lead me to anything just yet, though I'll keep trying.  It feels like this should be a pretty simple process to run through, so I think I'm probably missing something big here.  Please let me know if there's anything I can provide to make this clearer!

  • Otterfate - Tuesday, May 9, 2017 6:22 PM

    Hello!  I'm currently having an issue with transactional replication on some of our servers, and it's got me a bit puzzled.

    I need to reinitialize the publication, as there's been some changes on the subscriber that need to be wiped out and re-synchronized with the publisher.  To do so, I've marked the publication for reinitialization, run the snapshot agent, then the log agent, then the distributor.

    However, the distributor won't pick up the snapshot from the snapshot agent when it runs again; it instead tries to send over an erroneous command (updating a column that doesn't exist on the subscriber).  I've repeated the process in a few different variation, but it always fails this same way.

    Digging through Google for some similar cases hasn't lead me to anything just yet, though I'll keep trying.  It feels like this should be a pretty simple process to run through, so I think I'm probably missing something big here.  Please let me know if there's anything I can provide to make this clearer!

    What erroneous commands, errors, failures are you seeing? Are the correct jobs enabled/running on all the servers? You typically don't have to start things yourself so do you run all jobs manually - no log readers running continuously? 
    When you say you ran the snapshot agent then the log agent - where did you run these from? And then the distributor - what did you run at this part? 

    Sue

  • I'm seeing failures like "Error when attempting to update table

    .  The table does not exist.".  Paraphrased a bit.  Basically, the distributor is trying to send over an update to a table that no longer exists on the subscriber.

    I've got the distribution agent set up on the publishing server, set to run every 10 minutes, and the log reader agent, set to run every 10 minutes.  The Snapshot Agent is only run on demand, not on a schedule.

    When I ran the agents, I went to the Replication Monitor, double-clicked the publication in the All Publications page, went to Actions, and clicked Reinitialize Subscription.  I then chose to reinitialize from a new snapshot.  Next, I went to the Agents page, right-clicked the Snapshot Agent, and clicked Run Agent.  Then, once it finished making the snapshot, I went back to the main All Subscriptions page, double-clicked the publication, went to Actions, and clicked Start Distribution Agent.  The status window starts outputting messages, and almost immediately hits the aforementioned failure.

    It's like the Distribution Agent just doesn't want to use the snapshot that's been made; quite puzzling!

  • Otterfate - Wednesday, May 10, 2017 7:49 PM

    I'm seeing failures like "Error when attempting to update table

    .  The table does not exist.".  Paraphrased a bit.  Basically, the distributor is trying to send over an update to a table that no longer exists on the subscriber.

    I've got the distribution agent set up on the publishing server, set to run every 10 minutes, and the log reader agent, set to run every 10 minutes.  The Snapshot Agent is only run on demand, not on a schedule.

    When I ran the agents, I went to the Replication Monitor, double-clicked the publication in the All Publications page, went to Actions, and clicked Reinitialize Subscription.  I then chose to reinitialize from a new snapshot.  Next, I went to the Agents page, right-clicked the Snapshot Agent, and clicked Run Agent.  Then, once it finished making the snapshot, I went back to the main All Subscriptions page, double-clicked the publication, went to Actions, and clicked Start Distribution Agent.  The status window starts outputting messages, and almost immediately hits the aforementioned failure.

    It's like the Distribution Agent just doesn't want to use the snapshot that's been made; quite puzzling!

    Did you check for errors with any of the jobs?
    Did you check the synchronizations for the publication - Publisher to Distributor, Distributor to Subscriber? And do you have undistributed commands?
    Did you check for any errors with any of the agents? Keep track of the xact_seq_no if there are any.

    You may also want to query MSrepl_errors in the distribution database and again keep track of the xact_seq_no if there are any.
    If you have any errors with the agents if you have any xact_seq_no with errors, you can check more details on this by passing the transaction number into sp_browsereplcmds to see what was going for that transaction. Use that xact_seq_no for the start and end - along the lines of:
    EXEC SP_BROWSEREPLCMDS @xact_seqno_start = '0x00000027000000B50008',@xact_seqno_end = '0x00000027000000B50008'

    The command column will have more information. You can also find more information in the MSdistribution_history table in the distributor. Look for 'Distributed snapshot..' in the comments column and then check what happened after that. You can filter the queries by start_time to limit the volume and find a time just before the reinitializing.

    It sounds like you may have undistributed commands that happened before the reinitialization that would no longer be valid commands. How often are you running the agents as it sounds like they aren't continuous but on a schedule? You may want to consider dropping those subscriptions and recreating them as it seems things may be fairly out of synch - especially if the publications you are reinitializing are the ones with the errors for unknown tables, columns.

    Sue

  • First off, you should avoid making changes to tables on the subscriber side in the first place - make your changes on the publisher and they should replicate automatically as long as they don't require the table to be recreated.  So for example, if you add a column to a publisher table, it should appear on the subscriber side as well as long as it's at the end of the table.

    That being said, what you probably need to do at this point is go into the properties of the publisher, go into Articles, and for each article (table) you know has changed, go into the properties of the article and look in the Destination Object section.  Look for a setting called "Action if name is in use" and tell it to drop and recreate the object.

    Then reinitialize the whole subscription.  It may take a while depending on how much data you are replicating, but it sounds like you have unfinished transactions and frankly, it'd probably take longer to resolve things manually.

  • Ah!  After some digging into the whole thing, I found the problem.  A very simple one, and it's entirely my fault for not doing more research here!

    The problem was the parameter @sync_type; we had it set to 'replication support only' for this publication.  However, that requires a drop and create on the publication in order for any schema changes to be picked up, and the publication will silently refuse to use any snapshots you create for it.

    We changed the @sync_type to 'automatic', and things are working fine now!  I do realize that changes at the subscriber should ideally not be happening, but, unfortunately, our developers kind of work wherever they would like at any given time, so they'll do development work on the subscriber as well.  A bit of a bother, but at least I know how to get it ironed out now!

    Thank you for your help!  It got me pointed in a different direction than I had been looking originally, and that lead me to a solution.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply