AlwaysON with more than 5000 databases

  • Recently I received the task to build an AlwaysOn Setup with more than 5000 databases in four SQL Server instances . Customer wants to use their current hardware that doesn't support Windows 2012. In resume the configuration needs to be as follows:

    - 16 cores and 64 GB of RAM.

    - 4 SQL Server Instances: two of them with more than 2500 databases each and other two with only 5 databases.

    - Three nodes cluster, two on the main site and the third in DR with synchonous replica in the main site and asynchronous in DR.

    - SQL Server 2012 enterprise SP1 CU8

    - Windows 2008 R2 with latest SP and recommended patches for AlwaysOn.

    - Databases are very small with an average of 5 GB or less.

    - Concurrency could arrive to 20% of the databases.

    - I've measured the amount of log data being used currently and one Gb nic should be enough but I have the opportunity to add more bandwith using teaming. I think that an async setup in main site could be also posible but the first idea is to configure it with sync and automatic.

    Currently I'm building a PoC just to test if it's posible to configure it, test under load conditions and failover and failback AGs. We asked Microsoft about it and they said that they tested it with 10 AGs with 100 Dbs Each and anything beyond this needs to be tested. As an "alternative" they suggested us to use logshipping. The requeriments for this custormer makes imposible to use logshipping and the amount of administrative effort makes it imposible to manage.I know that AlwaysON will be massively dificult to manage too.

    In my previous tests, I think I will end my first real tests in two days with more hardware, I found that AGs unexpectelly went out of sync and I found many errors with laks of resources related with lack of workers. I raised the number of workers without any improvement. If the AG is out of sync nothing that I've tested can make it work again with the exception of restart SQL Server (restart endpoints, reconfigure any AlwaysOn parameters, replica timeouts ... )

    Any idea ?? any experience with it or suggestion ?

    Thank you in advance,

    J

  • JoseM (2/4/2014)


    Recently I received the task to build an AlwaysOn Setup with more than 5000 databases in four SQL Server instances . Customer wants to use their current hardware that doesn't support Windows 2012. In resume the configuration needs to be as follows:

    - 16 cores and 64 GB of RAM.

    - 4 SQL Server Instances: two of them with more than 2500 databases each and other two with only 5 databases.

    - Three nodes cluster, two on the main site and the third in DR with synchonous replica in the main site and asynchronous in DR.

    - SQL Server 2012 enterprise SP1 CU8

    - Windows 2008 R2 with latest SP and recommended patches for AlwaysOn.

    - Databases are very small with an average of 5 GB or less.

    - Concurrency could arrive to 20% of the databases.

    - I've measured the amount of log data being used currently and one Gb nic should be enough but I have the opportunity to add more bandwith using teaming. I think that an async setup in main site could be also posible but the first idea is to configure it with sync and automatic.

    Currently I'm building a PoC just to test if it's posible to configure it, test under load conditions and failover and failback AGs. We asked Microsoft about it and they said that they tested it with 10 AGs with 100 Dbs Each and anything beyond this needs to be tested. As an "alternative" they suggested us to use logshipping. The requeriments for this custormer makes imposible to use logshipping and the amount of administrative effort makes it imposible to manage.I know that AlwaysON will be massively dificult to manage too.

    In my previous tests, I think I will end my first real tests in two days with more hardware, I found that AGs unexpectelly went out of sync and I found many errors with laks of resources related with lack of workers. I raised the number of workers without any improvement. If the AG is out of sync nothing that I've tested can make it work again with the exception of restart SQL Server (restart endpoints, reconfigure any AlwaysOn parameters, replica timeouts ... )

    Any idea ?? any experience with it or suggestion ?

    Thank you in advance,

    J

    It is my professional opinion that you have little hope of success on this project given the current hardware. That is a LOT of threads just locked up for mirroring purposes. 1Gb NIC doesn't leave much wiggle room for the "are you still up" checks that used to be done on the private network either. You are also carving up your RAM with a good bit of "wastage" in the 4-instance arrangement. With SSDs, LOW latency network and probably more cores, you might have a chance.

    I see you already went here: http://blogs.msdn.com/b/psssql/archive/2012/05/17/alwayson-hadron-learning-series-worker-pool-usage-for-hadron-enabled-databases.aspx

    and here: http://blogs.msdn.com/b/sql_pfe_blog/archive/2013/07/15/monitoring-sql-server-2012-alwayson-availability-groups-worker-thread-consumption.aspx

    I have a client for whom I was able to coax SQL Server 2005 to handle 7400+ databases, on 2006-era hardware no less! I can tell you that EVERYTHING I tried to touch that server with barfed on the metadata (including tools from pretty much every major tools vendor at the time) of that many databases. NOTHING automated worked (even log shipping). Thank G-d I WAS able to use SSMS!! I built a custom log-shipping mechanism that handled the backups and restores to a second machine that was used as a warm standby AND read-only reporting box. Worked like a champ once the kinks got worked out. It's still running (on the same hardware!).

    So the question is WHY won't the client entertain tlog shipping?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply