Transactional Replication - Uninitialized subscription

  • I've created a new subscription, and it's pushing out to a shared directory. I see the snapshot job run and the snapshot files on the share, I see the log reader job continuously running, and I see the distributor job running on the subscriber; however, the snapshot never seems to be delivered. I drop in a tracer token and get publisher-to-distributor times of 2 seconds, but it looks like the distributor is never picking it up.

    I've verified that the subscriber is configured to the right snapshot location. Past that I have no idea what's going on. Can anyone give me a good way to check what's going on and why it isn't initializing?

  • I hope you have checked the replication monitor?

    M&M

  • Yes, that's what's reporting the uninitialized subscription.

  • Did you check the job history for the distribution agent? It is probably retrying every couple of minutes, but it will have an initial step that did not complete, most likely step 2. That step should have a message for you to research further.

  • It's been running successfully for days. No error, but the "uninitialized subscription" and not picking up the files.

  • I think that I might re-initialize the subscription and run the snapshot again, unless it's really too big to do that. You should see some message in the distribution history if there is an issue applying the snapshot.

  • I've reinitialized, but the snapshot just sits there like a bump on a pickle, and the distribution job keeps running. I've also tried recreating the subscription, and removing the publication and recreating it. No dice.

  • You could enable logging for the agents (snapshot agent and distribution agent) and see if anything more verbose is logged which could help pinpoint the issue.

    Follow the steps listed here to enable logging for the agents.

  • I think I've found my problem. My server seems to cap out at 96 continuously-running distribution jobs, once I attempt to set up the 97th it won't initialize the subscription, unless I stop another running job.

    96 seems like a strange number, so I'm guessing it's a function of memory, hard disk space, network resources, processing power, etc. Does anyone know where I can find the magic formula for how many of these jobs I can have running at a time? I don't want to have us get more memory only to find out network resources are the limiting factor, etc.

    Thanks in advance,

  • I think I've found my problem. My server seems to cap out at 96 continuously-running distribution jobs, once I attempt to set up the 97th it won't initialize the subscription, unless I stop another running job.

    There are several different things to look at. First, check your Windows event logs and see if there are any issues such as agent jobs not starting. Check both Application and System. This will be the first clue, but from the number of jobs you just provided I suspect that you are hitting limitations in two areas. The first is in the SQL Server subsystem settings. The second is in the desktop heap.

    I currently am working on a centralized distributor which has over 650 subscriptions and climbing. I hit both issues. MS was less than forthcoming with some of the information I wanted. But, I now have everything working. But my first indication of a problem was that the jobs were not working properly because of these issues.

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • Seth

    the magic formula for how many of these jobs I can have running at a time? I don't want to have us get more memory only to find out network resources are the limiting factor, etc

    Additionally, there is no magic formula. You have to experiment.

    First, though, what type of replication are you setting up, and what version of SS are the publisher and the subscribers?

    Where is your distributor, and what version of SS. What kind of machine is it on? How many CPU and how much memory? Replication systems can be resource hogs so you have to ensure that the server is not going to die.

    Steve Jimmo
    Sr DBA

  • I would have assumed it was an "experiment" thing except that it seems to be a hard cap. I would expect performance to steadily degrade (more latency etc until it just gives up the ghost entirely). This seems to be *bang* 97th job and "Uncle!"

    We're using transactional replication with pull subscriptions (single subscriber) so the distributor is also the subscriber (which probably isn't helping things) and runs continuously. I'm just anticipating going to management with a vague "we need more resources" and being clobbered when I can't answer what it is that we need. I'll check into the eventlogs and see if I can divine something from it, but if I recall from when I was having the uninitialized subscription issue there wasn't anything out of the ordinary. SQL seemed to report everything was fine, just wouldn't suck the darn data in.

  • Seth,

    Run the following against your distributor in the msdb database:

    SELECT subsystem, agent_exe, max_worker_threads
    FROM syssubsystems
    WHERE event_entry_point = 'ReplEvent'

    This will tell you the configuration for your various replication agents. My understanding is that these settings are created when SQL is installed initially. It is also my understanding that when you increase the resources, that these configurations do not change.

    Additionally, you will need to look at a registry setting on your server (again the distributor) HKLM\System\CurrentControlSet\Control\Session Manager\SubSystems\Windows

    It will look something like this: %SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,20480,7168 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ServerDll=sxssrv,4 ProfileControl=Off MaxRequestThreads=16

    You are interested in the SharedSection entry (SharedSection=1024,20480,7168 in the value above). The third number is the one that is used for the running agents. What MS had said was that you can increase this number by 512 at a time and test it. (Don't use mine, as this is for a large box dedicated to being a distributor.)
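    As a rough illustration of that "increase by 512 and test" step, here is a minimal Python sketch that pulls the SharedSection numbers out of the registry string and returns a copy with the third number raised by one step. The function names are made up for this example, the sample string is a shortened copy of the value quoted above (used only to show the parsing, not as a recommended setting), and the sketch only edits text: it does not read or write the registry.

```python
import re

# Matches "SharedSection=" followed by comma-separated integers.
_SHARED_SECTION = re.compile(r"SharedSection=(\d+(?:,\d+)*)")


def parse_shared_section(value: str) -> list[int]:
    """Return the SharedSection numbers found in the registry string."""
    match = _SHARED_SECTION.search(value)
    if match is None:
        raise ValueError("no SharedSection entry found")
    return [int(n) for n in match.group(1).split(",")]


def bump_third_value(value: str, step: int = 512) -> str:
    """Return a copy of the string with the third SharedSection number raised by `step`."""
    sizes = parse_shared_section(value)
    if len(sizes) < 3:
        raise ValueError("SharedSection has fewer than three numbers")
    sizes[2] += step
    replacement = "SharedSection=" + ",".join(str(n) for n in sizes)
    return _SHARED_SECTION.sub(replacement, value, count=1)


# Shortened sample of the value quoted in the post (illustration only).
example = (
    r"%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows "
    "SharedSection=1024,20480,7168 Windows=On SubSystemType=Windows"
)

print(parse_shared_section(example))
print(bump_third_value(example))
```

    Working on the string rather than the live registry keeps the change reviewable: you can compare the old and new values before touching the server, then apply and test one step at a time as described above.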

    Yes, the distributor can be a resource hog based upon the number of agents running. I do not know what size your distributor is or how many publications/subscribers there will be, but in my case it was a major pain to get to where we are now. We are running a box with Win 2K8 with 16 CPU and 32GB of memory as our central distributor. It has 12 publications and 632 subscriptions currently. When finished we will be over 1000 subscriptions with over 30 publications. The majority of our subscriptions are push. Our publishers are SS2K5 with the distributor on SS2K8. Our subscribers are a mix of 2K and 2K5. I anticipate increasing the memory, probably to double. My latency over T1s is approx 3ms.

    I started with a 4CPU server and 24GB memory on a server which houses 2 databases. I killed that box in no time.

    Good luck

    Steve Jimmo
    Sr DBA
