Multiple Databases vs. Minimal Databases

  • My company is in the beginning stages of redeveloping our entire system.  We currently have ~500 customers and that number is growing at an increasing rate.  I expect it could be 5000 in a couple of years.  Our current system was designed with 1 db per client.  There are a couple of other dbs that contain common data.  I can see benefits of this design but I also understand the minimal database design where all clients and their info would exist in a single database or a single set of databases.  The biggest benefits to 1 db per client is downtime.  If a single customer is down the others aren't affected.  On the flip side, administration is and will be a nightmare.  Has anyone dealt with this issue before?  If so, what decision did you make and was it the correct one? 

    Any advice is truly appreciated.

     

    Keith West

    MCDBA

  • I never been in this situation but I will go with multiple dbs to reduce the down time and management can handled through scripts....

     

    MohammedU
    Microsoft SQL Server MVP

  • Imagine a million records or more per client.  Imagine all the clients info in one database.  Not too big for SQL but definitely a performance issue.  Now, when you get to the stage that the server needs to be upgraded, for speed purposes, do you upgrade or just buy a new server?  We all know that 95% of the time you buy a new server.  Would it not be easier to move 60% of your client base to the new server?  That way the old server can still perform and everything is just faster.

    5000 Databases?  Yea, its a lot, but it is still better than one database.

    5ilverFox
    Consulting DBA / Developer
    South Africa

  • I currently manage 10K+ databases spread over two servers.  One of them is a reporting/warm standby server that keeps fairly recent data on it from another other server.  It is was a frickin' NIGHTMARE to get everything set up properly for automation (maintenance, backups/restores, etc).  Now that I have a handle on that it is "relatively" straight forward but still a LOT of effort. 

    You are correct about ease-of-restore if one client poops their data.  We have had that several times just in the last few months.  A log-reader application such as ApexSQL Log could allow you to easily and quickly recover/restore a single client's data/bad transaction in a combined structure however.  Japie is also correct that doing maintenance on tables with millions of rows will take extra effort too.  If you can prune the data regularly you could address this issue. 

    On the flip side, with each client having their own data there is NEVER a time where RAM cache will help you.  If a client needs data, odds are it will have to come from disk.  Make DAMN SURE you have VERY capable I/O subsystem on your database server.

    BTW, you will be stunned by how much activity msdb sees with 5K databases.  Having that many also precludes some architectures such as mirroring and replication too. 

    We are about to take the next level actually.  One primary server with 'global' information on it which will be referenced by multiple servers with 2-4K databases each on them.  That first step from 1 to 2 production servers is going to be a DOOZY!!  🙂

    Best,

    Kevin

    P.S.  Sorry, but there is no right/wrong answer for your situation, IMHO.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • i'll take 1 db per client over 1 db to rule them all any day

    if you move everything into 1 db you have to keep track of which table belongs to which customer. you can use different schemas, but I still think it will be a nightmare as customers create new tables

  • Clearly you would not build an application that required new tables for each client.  The original poster implied (to my belief anyway) that they were redesigning an existing application.  The schema changes to migrate multiple clients to single db would be to obviously link each table's data to a specific client with a ClientID type field.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Kevin G is correct.  All client dbs would have the exact same schema.  In fact that is how we create a new client.  We have a client model database that is copied and renamed when a new client is contracted.

    I would definitely use a ClientID field if moving to a consolidated database.

    Keith

  • Security might be another concern... when you have one db for all clients ... what happens if one client deleted another client's data?

    In this case you may endup with creating views...

     

    MohammedU
    Microsoft SQL Server MVP

  • As long as you include a ClientID in your action queries, it would take a real blunder to delete the wrong data I think.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • That is correct as long you give the access to the db through application but I have seen the cases sometimes clients want to generate custom reports which requires db access and it can't be managed based on the clientid...

     

    MohammedU
    Microsoft SQL Server MVP

  • Why??????  If the report used the clientid as a parameter, why can you not control it?

  • One other issue that is cropping up with using multiple dbs is that all of the sql for the application must be dynamic to include the db name.  This makes writing the sql a little more difficult but all of our sql is controlled through procs also.

    Keith

  • >> That is correct as long you give the access to the db through application but I have seen the cases sometimes clients want to generate custom reports which requires db access and it can't be managed based on the clientid...

    That would NOT be allowed at one of MY clients!! 🙂

    As for the dynamic execution, you can get around that in two ways:  1) have sprocs in every client's database.  That way they can reference their own data without having to create query strings that have dbname.dbo.sometable in them.  2) use fully external ADO code without sprocs.  Yeah, I know, laugh it up!!  LOL  But that is what my current client with 1 db per their client does.  And it works fine, if slowly. 

    But with dynamic execution or poorly written ADO code (non-sproc) you are going to have a hefty CPU penalty to pay for compilations.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I used to work as the lead DB architect for the 4th largest SW company in the world, and we were facing the same issue when we started building the architecture 7 years ago. The environment ended up being 150 servers with 50+ TB of data. When facing this architectural problem, there are several points to keep in mind:

    - Data to be distributed on each database - data that belongs to a single customer and need not be accessed in an aggregated fashion should probably reside only on the database of each customer

    - Centralized data - Data that needs to be accessed in an aggregated fashion (e.g., user contacts, web users, phone numbers, etc.) for each customer may need to be stored in a central localtion and not on each database. This is important for cases where (for example) customer users log in through a portal to check their data, gathering business related information and metrics by your DBA and infrastructure team, and so on.

    Determining the data sets that will be stored in a distributed vs. centralized fashion can be tricky and require tuning / changes.

    - Server licenses - depending on the size of each database, plan ahead and consider the possibility of large costs for SQL licenses.

    - Backups and load balancing - distributed architectures allow you to (relatively) easily balance the load between servers by moving databases around. Backups will also be easier since the backup related loads are spread across servers.

    - Deploying data and schema to multiple databases / servers - back in the days this was a real challenge. These days you can use tools like SQL Farm Combine to deploy all your scripts, schema, and data to all databases in parallel so it can be done fast. This tool will also let you collect data from all your servers, e.g., for reporting purposes, etc.

    - Users / logins issues - this would depend on your specific setup but has to be considered carefully as well.

    Anyways... hope this gives you a good start. I have done tons of work in this setup, so please free to email for questions and such.

    Editor's Note: : Dr. Omri Bahat works for SQLFarms, a software vendor of SQL Server tools.

  • I had exactly the same problem, the same discussion, and the same uncertainty over the way to design database. I have client with 1500 databases that are exactly the same and I perfectly understand challenges in support. On the other hand, I have a client with one database for 500 clients and I see this client struggling over down time, necessity to separate and move data to clients’ site, and performance issues that affect all clients at the same time.

    Now I have a request to create a design for a new application. I’ve start asking myself and start terrorizing my peers with the same questions. After several conversations I figure out one simple rule and clear problem resolution: cost-benefit analysis. It leads me to the flexible and adjustable solution: a hybrid one.

    It is not really how much a customer can pay but how much it will cost to manage customer’s environment. The cost of the support per transaction should be the same for a small and for a big customer thus my answer on one or many databases question will depends on number of operations per specific customer.

    Let’s say I have only one (or two, or five) big customer. There is no doubt I need a separate database for one, two, or five customers. However, at some point even one customer going to need horizontal partitioning because of a regional spread of transactions or just because it is too big to handle in one piece (I have perfect example for that too!).

    On the other hand, if I have a lot of customers that are so small that cost per operation (including support and hardware) is too high to keep them in separate databases. In this case I would probably try to keep them all in the same database. It will decrease cost dramatically. Most likely my design will be a horizontally partitioned (by customer) database. Does it sound familiar with the first case? I think so!

    There is another example in the real world: multi-national bank. I do not think any big bank have one humongous database with all clients in the same place. Why somebody would have it like this? Most likely a bank has regional databases with regional clients in each and mechanism that replicates data if a customer moves to another region. Kind of the same idea should be under my design as well.

    So, instead of taking one side (one database for all) or another (one database for each client), I would stick to a hybrid solution of N databases for M clients with the business rule that links number of transactions per period (for example) to a place of the database. Sure each database will be partitioned per client and per region (for example). The rule is flexible too at the design stage! In this case I have easy way to balance load, cost of support, down time and all other issues with stripping out or moving together databases in any possible way. Does it make sense to anybody? I would appreciate feedback as well.

    Alex Prusakov

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply