A Question of Scale?

  • One of the knocks on SQL Server over the years has always been its ability to scale. Oracle has always been the leader and has been used in many environments with extremely large numbers of users or large amounts of data. SQL Server is usually relegated to second fiddle, or third behind DB/2, when it comes to handling large loads.

    SQL Server 2000 was supposed to change a lot of that perception with very large TPC-C numbers. And not just the price/performance numbers, but the overall performance numbers. Yet still the perception has been that SQL Server can't handle the load. When I managed the DBA group at PeopleSoft, the DB/2 and Oracle DBAs wouldn't dream of loading any of our enterprise systems on SQL Server. After all, they ran on a 12-way Sun box or an 18-way IBM box; how could a little 4-way SQL Server box handle the load? My contention was that I wouldn't hesitate to run SQL Server, albeit on a larger box, and I'd stake my job on it. Nobody ever took me up on the offer 🙁

    So with expectations that SQL Server 2005 would increase that scale, and with the keynote at TechEd showing it handling much larger loads than SQL Server 2000, I was stunned to see this article in Redmond Magazine that referenced this Forrester report.

    The gist of the report, apparently, since I declined to buy it, is that perhaps SQL Server isn't a better choice than Oracle or DB/2 and isn't yet ready to be considered their equal in terms of its scalability. The Redmond Mag article mentions that Microsoft contests the report (duh!) and says that the benchmarks Microsoft has published, while better than SQL Server 2000's, aren't the best in either category.

    First of all, I'm a Microsoft guy. I'll defend SQL Server and I believe it can handle most loads. It might not be the best for the largest companies, but how many of us work with the largest workloads, even the top 5%? I'll tell you: 5%. That means that 95% of us would do great with SQL Server and save a boatload of money. Plus I'm irked that this supposedly rigorous "report" costs $49. What better business model is there: charge money for something that people can charge to their company and use to support their point of view.

    And get two Oracle guys to write it.

    Steve Jones

    PS, and Craig agrees it's too soon to knock 2005.

  • You tell 'em Steve!

    I personally administered a 1.6 TB data warehouse on SQL Server 2000. It can be done; it requires knowledge of the system and the data to tune it. While it may be true that Oracle could have done the job with little or no tuning (not to mention about 10 times the cash), it raises the question: what are those Oracle folks getting paid so much to do if they don't have to tune their databases?

    :{> Andy

    Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics

  • I went to a SQL2005 TechEd briefing and was told that ultimately SQL2000 was limited; however, no expense has been spared on SQL2005 to ensure that it does scale.

    With SQL2000 I think it is a question of "scale to what"? As you said in your article, how many organisations really need such massive scalability?

    I don't want to imply that SQL 2000 is merely "good enough" and no more, because the articles you note say it dominated on manageability and usability. For most people this knocks the opposition into a cocked hat. This is why Microsoft has had such huge dominance in the desktop market as well.

    What is the point of having the ultimate database server if your staff have to fight it to get the best out of it?

  • SteveJ, thanks as always for a thought-provoking editorial.

    On this topic, I'd like to see a discussion on the practical ramifications of the difference in locking methods between SQL Server and Oracle.

    My experience has always been with moderate-sized databases with moderate loads - nothing too heavy-duty. But Oracle guys always tell me that once you have a "large enough" rate of transactions against a "big enough" database, SQL Server will start to bog down because of lock escalation.

    The claim is that this is an inevitable result of SQL Server's "readers block writers and writers block readers" locking philosophy. That is, someone reading from a resource will be granted a Shared lock on it (at record, page, or table level, as appropriate), and while that Shared lock is in place, someone else can't be writing to the same resource. Likewise, a "writer" will place an Exclusive lock on a resource, which will prevent a "reader" from reading from it. (The various types of locks in SQL Server are structures in memory.)

    Oracle apparently does it differently - locking information is stored within the data row itself. A "reader" will not lock the row from a "writer", nor vice versa. If a resource is being written to, any "readers" will get the data from a Rollback Segment, which is an image in memory of the data before any changes were made to it. The point is that readers will not see any uncommitted changes. Because of this behavior, they say, Oracle doesn't start panting for breath when transaction rates get very high.

    (I should have mentioned that the above discussion assumes a Transaction Isolation Level of Read Committed in both cases.)
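
    To make that concrete, here's a minimal sketch of the blocking behavior described above, using two query windows and a made-up Orders table (the table and column names are just for illustration):

        -- Session 1: a "writer" that holds an Exclusive lock (transaction left open on purpose)
        BEGIN TRANSACTION;
        UPDATE dbo.Orders
        SET    Status = 'Shipped'
        WHERE  OrderID = 42;
        -- no COMMIT yet, so the Exclusive lock on that row is still held

        -- Session 2: a "reader" under the default READ COMMITTED isolation level.
        -- This SELECT needs a Shared lock on the same row, so it will sit and wait
        -- until Session 1 commits or rolls back.
        SELECT OrderID, Status
        FROM   dbo.Orders
        WHERE  OrderID = 42;

        -- Session 1: releasing the lock unblocks the reader
        COMMIT TRANSACTION;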

    A related thing that they always cite is that SQL Server does automatic lock escalation that can't be tuned. That is, although SQL Server will try to lock at the finest granularity, say, record locking, once there are "too many" record locks (which are structures in memory), SQL Server will start escalating them to Page or even Table locks.

    I have to admit that the Oracle argument here sounds convincing. What I don't know is how it pans out in practice. For example, do the Rollback Segments (I think they're Undo Logs in the latest version of Oracle) have their own problems in terms of tuning and overhead? Also, in the case of SQL Server, do the TPC benchmarks somehow exercise lock escalation?
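
    If anyone wants to watch escalation happen on a SQL Server 2000 box, something like the following rough sketch should show it (the table name is invented, and the exact escalation thresholds are internal to the engine):

        -- Open a transaction that touches a lot of rows and leave it uncommitted
        BEGIN TRANSACTION;
        UPDATE dbo.BigTable
        SET    Flag = 1
        WHERE  Flag = 0;   -- assume this hits many thousands of rows

        -- From another connection, look at the locks that SPID is holding.
        -- If escalation kicked in, you'll see a TAB (table) lock instead of
        -- thousands of individual RID/KEY locks.
        EXEC sp_lock;

        ROLLBACK TRANSACTION;

        -- Escalation can be switched off server-wide with trace flag 1211,
        -- but that trades lock memory for granularity and is rarely a good idea.
        -- DBCC TRACEON (1211, -1);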

    What are some examples of actual companies using SQL Server for actual large-scale, high-transaction applications, and how do they do it?

    Also, I understand that SQL Server 2005 will have something that acts like the Oracle Rollback Segment in the form of the new Snapshot transaction isolation level.
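
    For reference, this is roughly what the new option looks like in SQL Server 2005 (the database, table, and column names are placeholders); readers see a versioned copy of the row instead of blocking on the writer's lock:

        -- Enable row versioning for the database (a one-time setting)
        ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

        -- A reader can then opt into snapshot isolation per session or transaction
        SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
        BEGIN TRANSACTION;
        SELECT OrderID, Status
        FROM   dbo.Orders
        WHERE  OrderID = 42;   -- sees the last committed version, doesn't block on writers
        COMMIT TRANSACTION;

        -- Alternatively, READ_COMMITTED_SNAPSHOT changes the behavior of the default
        -- READ COMMITTED level without any application changes:
        -- ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;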

    Well, I hope I've been able to instigate a good discussion on this topic.

    Best regards,

    SteveR

  • You say that the locking issue only applies at the default isolation level of READ COMMITTED.

    Surely this knocks the Oracle guys' argument for six, because although it is the default isolation level, it can be specified as something different for a particular transaction.

    Surely this makes it a database design issue rather than any inherent fault in SQL Server?

    Also, the terms "large enough" and "big enough" are not quantitative terms. It's like getting an LA weather girl to describe a cold day!

  • Or a Mexican describing a hot day.

  • Hi David,

    >>Surely this knocks the Oracle guys' argument for six, because although it is the default isolation level, it can be specified as something different for a particular transaction.<<

    Right, you could use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED, for example, but that might not be appropriate for your application - readers could be reading data that's in the middle of an update transaction but eventually gets rolled back.
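
    Something like this sketch (the table is made up for the example) shows the risk:

        -- Session 1: an update in flight that will ultimately be rolled back
        BEGIN TRANSACTION;
        UPDATE dbo.Accounts
        SET    Balance = Balance - 500
        WHERE  AccountID = 7;
        -- ... some time passes before the rollback below ...
        ROLLBACK TRANSACTION;

        -- Session 2: a reader that has opted out of blocking
        SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
        SELECT Balance
        FROM   dbo.Accounts
        WHERE  AccountID = 7;
        -- If this runs before the rollback, it returns the reduced balance,
        -- a value that officially never existed in the database.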

    But again, I don't know how this works out in practice on "big" databases - I've never experienced any real locking problems, but I just haven't played in the Big Leagues yet.

    As far as what "big" means, that's what I'd like to hear about from the people on this forum - have you run into a situation where, at some point, SQL Server seemed to bog down on locking issues?

    Best regards,

    SteveR

    P.S. - those who know me know that I'm a BIG Microsoft booster. As I always say, Bill Gates puts food on my table, pays my mortgage every month, paid for my son's Bar-Mitzvah, etc. etc. I'm not at all interested in jumping ship to Oracle - but I'd like to know more about how these locking issues pan out in real life.

  • We've got some stuff coming in the Sept magazine on various scale and HA issues. It might answer some questions.

    As far as scale goes, I can point to two places where I've run into scale issues with SQL Server 2K.

    1. We ran a large DSS system on a 4-way/4GB/Enterprise Edition box and ran into thread starvation when we had more web users than we expected, usually over 40, hitting the front end. This was mainly due to the nature of the queries, and we solved it by rearchitecting some of the schema and changing a few of the query hints where we could, which let us grow the load. Granted, 4x4 isn't very large, but it was a scale limit we hit on that box. The queries were app-generated (Cognos, I think) and ran about 80-150 lines of nasty SQL code.

    2. We had a 2x4 box, Standard Edition, backing an intranet site. The content publishing system was heavily db-based and we ran a load test with 1,000 concurrent simulated users, which equates to something like 8-10k real users. The profile showed about 11-12k queries/sec, read only, and that maxed out the box. Capturing just statement completions generated about 40MB of Profiler trace per second. Needless to say, the web server appeared to be hanging somewhat, but the SQL box was simply at its limit. It was running full out. We never got the $$ to move to a bigger box, but it was a limit we hit.
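
    (For what it's worth, a quick and dirty way to see whether a 2000 box is up against that kind of limit is to look at what the connections are waiting on. This is only a sketch, and the SPID cutoff is just a convention for skipping system sessions:)

        -- What is everybody waiting on right now? (SQL Server 2000)
        SELECT   spid, status, blocked, lastwaittype, waittime, cpu, physical_io
        FROM     master..sysprocesses
        WHERE    spid > 50          -- skip system SPIDs
        ORDER BY waittime DESC;

        -- Worker threads are capped by this server setting (default 255 on SQL 2000);
        -- it's an advanced option, so expose it before reading it. Raising it isn't
        -- free, since every worker reserves stack memory.
        EXEC sp_configure 'show advanced options', 1;
        RECONFIGURE;
        EXEC sp_configure 'max worker threads';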

    I have seen similar limits on DB/2 in systems where the box just pegged. Any box can be pegged with a big enough load. Until now, there have just been bigger *nix boxes out there. Now you can run 32- or 64-way Windows boxes, but it's rare, so most of us just don't get the chance to compete with more than a 4-way or maybe an 8-way. The Oracle guys easily get hardware up over 16 CPUs and claim more scale.

    Now, the locking mechanisms are different, and Oracle's allows for more scale, but potentially you could be reading data that's since changed. More importantly, the lock escalation that SQL Server does, which makes it WAY easier to admin, means that sometimes you get blocking as the load scales up. Supposedly the snapshot isolation in SS2K5 will help with this, but we'll see.

  • I've seen an Oracle database slaughtered by a web content management system in testing, but in real life it coped just fine.

    One issue I have seen with testing is coming up with a real-world test. For example, 60-70% of all web site traffic goes to the site home page. The lion's share of the remainder goes to pages that can be directly linked off that home page; therefore, stress testing a web site has to use a model that represents that.

    In the aforementioned test on the Oracle system, the customer running the test wanted to prove that our code wasn't up to the job, so they deliberately skewed the test profile to run all the complex dynamic pages.

    On the SQL lock thing, does the following argument hold any water?

    Locks are managed in memory, so if you have a lot of locks you get a geometrically increasing performance burden: more locks = less memory available, less memory available = slower response, slower response = locks held for longer, locks held for longer = more lock conflict, and so on.

    In a large system there will be a large number of tables that simply aren't updated that frequently. They are practically read-only. There are also tables that are updated more frequently, but certainly not continuously. If these tables are going to be read, why not read them with the NOLOCK hint? Dirty reads are a non-issue because of the more-or-less static nature of their contents.
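
    In concrete terms, that argument boils down to something like this (the lookup table is hypothetical):

        -- How many locks is the server tracking in memory right now? (SQL Server 2000)
        SELECT COUNT(*) AS current_locks
        FROM   master..syslockinfo;

        -- Reading a practically read-only lookup table without taking Shared locks;
        -- WITH (NOLOCK) is the same as READUNCOMMITTED for this one table.
        SELECT CountryCode, CountryName
        FROM   dbo.Countries WITH (NOLOCK)
        WHERE  CountryCode = 'US';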

  • I am a VB.NET programmer connecting to a SQL Server 2000 database. I want to back up the SQL data from code in VB.NET: when I run my program and start the backup, the backup of the SQL database should be performed. Please reply to me with the correct code or SQL query by e-mail: rupendra_bainsh@yahoo.com

    Thanking you,
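
    (For the record, the statement such a program would typically send is roughly the following; the database name and backup path are placeholders, and a VB.NET program would execute it through an ADO.NET SqlCommand:)

        -- Full backup of a hypothetical database to a placeholder path
        BACKUP DATABASE MyDatabase
        TO DISK = 'C:\Backups\MyDatabase.bak'
        WITH INIT, NAME = 'MyDatabase full backup';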

  • Start a new thread for a new question; you will get more replies that way.
