Replicating large tables

  • We have set up transactional replication from one of our servers on one of the largest tables we have. The table is 300 million rows and takes about 100 GB of disk space. The initial replication worked fine; the problem is with subsequent updates. Sometimes we need to update several million rows in the table, which pushes the latency up to several hours. Any performance improvement tips?

  • Transactional replication updates one record at a time on all subscribers by default.

    You have some ability to manage this, but replication gets tricky to administer once you step outside the default behavior.

    One reasonable way to deal with this is to bypass replication for really big transactions like this one and run the update command on each subscriber individually. Look carefully at the transactional replication sections in Books Online for the options.
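
    For example, one documented option is to publish the execution of a stored procedure instead of the row changes it makes, so a huge UPDATE replicates as a single procedure call rather than millions of singleton commands. A minimal sketch; the publication, table, and procedure names here are made up:

        -- Wrap the bulk update in a stored procedure on the publisher.
        -- dbo.BigTable, dbo.usp_MassUpdate, and MyPublication are hypothetical.
        CREATE PROCEDURE dbo.usp_MassUpdate
        AS
        BEGIN
            UPDATE dbo.BigTable
            SET SomeFlag = 1
            WHERE SomeDate < '20080101';
        END;
        GO

        -- Publish the *execution* of the procedure, not its row changes.
        -- 'serializable proc exec' replays the call once on each subscriber
        -- when the original call ran inside a serializable transaction.
        EXEC sp_addarticle
            @publication   = N'MyPublication',
            @article       = N'usp_MassUpdate',
            @source_object = N'usp_MassUpdate',
            @type          = N'serializable proc exec';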

  • Where is your distribution db located?

    Is it on fast disks, like the transactional database you are trying to replicate? And if so, is it on the same set of disks?
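
    If you're not sure, here's a quick way to check file placement on the distributor (the distribution database is usually named distribution, but yours may differ):

        -- Show where the distribution database files physically live.
        SELECT name, physical_name, type_desc
        FROM sys.master_files
        WHERE database_id = DB_ID(N'distribution');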

    @SQLvariant

  • The distribution db, along with the snapshot folder, is on separate disks. For the time being I changed these tables to be replicated via a snapshot once a day. This is OK for now.

  • How many disks are in your array?

    I have a similar situation and routinely replicate hundreds of thousands to millions of rows in just a few seconds.

    I guess a better question to ask would be: what is your Average Disk Queue Length on those arrays?
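
    Average Disk Queue Length itself comes from Performance Monitor (the PhysicalDisk counters), but you can also get a rough per-file picture of I/O stalls from inside SQL Server; a minimal sketch:

        -- Cumulative I/O and stall time per database file since startup.
        -- Files with high stall time relative to I/O count point at a
        -- saturated array.
        SELECT DB_NAME(vfs.database_id)                     AS database_name,
               mf.physical_name,
               vfs.num_of_reads + vfs.num_of_writes         AS io_count,
               vfs.io_stall_read_ms + vfs.io_stall_write_ms AS io_stall_ms
        FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
        JOIN sys.master_files AS mf
          ON mf.database_id = vfs.database_id
         AND mf.file_id     = vfs.file_id
        ORDER BY io_stall_ms DESC;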

    @SQLvariant

  • The tlogs are on RAID 1 (2 disks), the data files reside on RAID 5 (6 disks), and the BCP files are on RAID 1. I'm able to push hundreds of thousands of rows within seconds, and that doesn't cause issues.

    The problem is when somebody or some process runs an update command against 10 million rows of data. It takes a while before the transaction makes it to the other side, and the problem gets even worse when another process runs another update at the same time. The log reader falls behind for several hours, but eventually it catches up. We have one process that runs on Fridays and updates the entire table; after that the log reader gets behind for days, and in that case it's easier to just disable publishing and restart from scratch.
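
    For what it's worth, a common workaround for exactly this pattern is to break the big UPDATE into small batches so the log reader and distribution agent never see one giant transaction. A rough sketch; the table, column, and batch size are made up:

        -- Update in 10,000-row chunks so each replicated transaction
        -- stays small. dbo.BigTable and SomeFlag are hypothetical.
        DECLARE @rows int;
        SET @rows = 1;

        WHILE @rows > 0
        BEGIN
            UPDATE TOP (10000) dbo.BigTable
            SET    SomeFlag = 1
            WHERE  SomeFlag <> 1;   -- filter keeps each pass from touching finished rows

            SET @rows = @@ROWCOUNT;
        END;

    Tracer tokens (sys.sp_posttracertoken) are also handy for measuring exactly how far behind the log reader and distribution agent actually are.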

  • I haven't tried this myself, but would it be possible to create multiple publications based on one table and use filtering?
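
    Haven't dug into the syntax either, but the row filter would go on the article when it's added; a minimal sketch splitting one table across two publications by key range (all names invented):

        -- Each publication carries half of the key range, so each gets
        -- its own distribution agent delivering changes in parallel.
        EXEC sp_addarticle
            @publication   = N'BigTable_Part1',
            @article       = N'BigTable',
            @source_object = N'BigTable',
            @filter_clause = N'Id <= 150000000';

        EXEC sp_addarticle
            @publication   = N'BigTable_Part2',
            @article       = N'BigTable',
            @source_object = N'BigTable',
            @filter_clause = N'Id > 150000000';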

    -----------------------------
    www.cbtr.net
    .: SQL Backup Admin Tool :.
