Log Shipping Performance Problems

  • Hi guys,

    I'm an accidental DBA who is looking after an MSSQL Server running about 600 small DBs. The databases are mostly read intensive, probably about 80/20 reads/writes.

    We are trying to implement Log Shipping for all the DBs to a server in the same datacentre (Secondary A) and also to another server in a separate datacentre (Secondary B). Secondary B also acts as the monitor server for both secondaries, so that we have all of the relevant info in case we have to recover from a disaster. I have written stored procs that enable everything on all the servers, and the setup side is all working well.
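
    (For reference, the procs are essentially per-database wrappers around the built-in log shipping configuration calls. A minimal sketch of the kind of call involved is below; the database, share, and server names are placeholders, not our real ones:)

        DECLARE @backup_job_id uniqueidentifier, @primary_id uniqueidentifier;

        -- Run on the primary for each database; matching
        -- sp_add_log_shipping_secondary_* calls go on each secondary.
        EXEC master.dbo.sp_add_log_shipping_primary_database
            @database                     = N'SomeSmallDb',        -- placeholder name
            @backup_directory             = N'\\PRIMARY\LogShip',  -- placeholder share
            @backup_share                 = N'\\PRIMARY\LogShip',
            @backup_job_name              = N'LSBackup_SomeSmallDb',
            @backup_retention_period      = 4320,                  -- minutes
            @monitor_server               = N'SECONDARY-B',        -- remote monitor
            @monitor_server_security_mode = 1,                     -- Windows auth
            @backup_threshold             = 60,                    -- alert after 60 min
            @threshold_alert_enabled      = 1,
            @history_retention_period     = 5760,                  -- msdb history, minutes
            @backup_job_id                = @backup_job_id OUTPUT,
            @primary_id                   = @primary_id OUTPUT,
            @overwrite                    = 1;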

    The problem is that once we got to about the 350-400 DB mark, we started to have major performance problems. The Secondary B server is freezing and doing all sorts of weird things, and we are receiving constant emails saying that the Log Shipping is taking too long to restore, copy AND backup. It seems to be caused by all the IO on the msdb database on Secondary B. All of the servers are writing monitoring info to it.
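
    (For anyone wanting to verify that kind of thing, a query along these lines against the standard file-stats DMV shows where the IO stalls are; the counters are cumulative since the last restart, not a rate:)

        SELECT  DB_NAME(vfs.database_id) AS database_name,
                mf.physical_name,
                vfs.num_of_reads,
                vfs.num_of_writes,
                vfs.io_stall_read_ms,    -- total ms spent waiting on reads
                vfs.io_stall_write_ms    -- total ms spent waiting on writes
        FROM    sys.dm_io_virtual_file_stats(DB_ID('msdb'), NULL) AS vfs
        JOIN    sys.master_files AS mf
                ON  mf.database_id = vfs.database_id
                AND mf.file_id     = vfs.file_id
        ORDER BY vfs.io_stall_write_ms DESC;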

    What I really wanted to gauge from this post is, are we completely crazy for ever thinking that we can do what we are trying to do?

    We never have any issues with our main server, and although we are bringing a second server online now and will be splitting the load between them, the Primary server still has plenty of resources to handle another hundred or so databases. I thought Log Shipping was a fairly low-impact HA method, and so far it hasn't seemed to affect the performance of the primary server. But it seems that the monitor server is just not coping. Granted, it does have slower drives and less RAM, but we assumed that it wouldn't need to be as powerful as the primary server, as all it was doing was storing some passive copies of our databases and some info on when they were last backed up.

    As it stands, it looks like we would need to make the Secondary B server just as powerful as the primary server and/or have multiple DR secondary servers and share the load across them. Either way, we'd be paying more for our backup servers than for our live servers, which is just crazy.

    Are we expecting too much from SQL?

    Look forward to your thoughts.

    Regards,

    Lukas

  • I think log shipping probably isn't the best solution for the volume of databases you are trying to restore. 600 databases is a massive administrative overhead.

    Do the databases need to be in standby mode or could you implement a disk mirroring solution such as doubletake for this?

  • If you want a warm standby or hot standby server, then you need the same capability on the secondary server as on the primary one, whether you go for mirroring or log shipping.

    If you just want a cold standby where you only store data, you ship daily full backups and transaction log backups to a file server, which you can then use to restore after the primary fails. This solution suits businesses that can afford downtime of more than 2 hours.
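
    A bare-bones sketch of that cold-standby approach, writing the backups straight to a UNC share on the file server (the paths are placeholders, and WITH COMPRESSION assumes SQL Server 2008 or later):

        -- Daily full backup to the file server
        BACKUP DATABASE SomeSmallDb
            TO DISK = N'\\FILESERVER\ColdStandby\SomeSmallDb_full.bak'
            WITH INIT, COMPRESSION;

        -- Transaction log backups on whatever schedule your RPO allows
        BACKUP LOG SomeSmallDb
            TO DISK = N'\\FILESERVER\ColdStandby\SomeSmallDb_log.trn'
            WITH NOINIT;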

    -cheers

    Sat


  • With log shipping, you can check the network bandwidth and the time it takes to copy from the source disk to the target disk. That depends on backup issues and performance problems on the SQL Servers.
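
    For example, a query along these lines against the built-in monitor table on the monitor server shows the copy/restore lag per database (column names assume the standard msdb monitor tables):

        SELECT  secondary_database,
                last_copied_date,
                last_restored_date,
                last_restored_latency,   -- minutes between log backup and its restore
                restore_threshold        -- alert threshold in minutes
        FROM    msdb.dbo.log_shipping_monitor_secondary
        ORDER BY last_restored_latency DESC;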

  • I would try to combine all the Log Shipping jobs into one job.

    The result should be that the jobs execute one after another and don't run in parallel. Also set the job schedule so that one run finishes before the next one starts.

    This should also protect you from the performance problems, if you do it for all three job types (backup, copy, and restore).
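
    A sketch of one way to do that for the restore jobs, assuming the default log shipping steps (CmdExec calls to sqllogship.exe); the paths and IDs below are placeholders, so copy the real commands out of your existing LSRestore_* jobs:

        USE msdb;

        EXEC dbo.sp_add_job @job_name = N'LSRestore_AllDatabases_Serial';

        EXEC dbo.sp_add_jobstep
            @job_name          = N'LSRestore_AllDatabases_Serial',
            @step_name         = N'Restore Db1',
            @subsystem         = N'CmdExec',
            @command           = N'"<path>\sqllogship.exe" -Restore <secondary_id_1> -server SECONDARY-B',
            @on_success_action = 3,   -- go to the next step
            @on_fail_action    = 3;   -- keep going even if one database fails

        EXEC dbo.sp_add_jobstep
            @job_name          = N'LSRestore_AllDatabases_Serial',
            @step_name         = N'Restore Db2',
            @subsystem         = N'CmdExec',
            @command           = N'"<path>\sqllogship.exe" -Restore <secondary_id_2> -server SECONDARY-B',
            @on_success_action = 1;   -- last step: quit reporting success

        EXEC dbo.sp_add_jobserver @job_name = N'LSRestore_AllDatabases_Serial';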

    Also increase the alert thresholds for the warnings, depending on how long the jobs take to run.
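
    Widening the thresholds can be done with the documented change procs (values are in minutes; the database name is a placeholder, and where the thresholds live depends on whether you use a remote monitor):

        -- On the secondary: allow 3 hours before the restore alert fires
        EXEC master.dbo.sp_change_log_shipping_secondary_database
            @secondary_database = N'SomeSmallDb',
            @restore_threshold  = 180;

        -- On the primary: same idea for the backup alert
        EXEC master.dbo.sp_change_log_shipping_primary_database
            @database         = N'SomeSmallDb',
            @backup_threshold = 180;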

    The disadvantage is that the data on the backup server lag further behind.
