911: developers insist PK/FK constraints impede performance

  • I just walked out of my first dev meeting at a new job and I'm stunned to learn that the developers NEVER used primary key/foreign key constraints because they said they degraded performance.   The developers stated that RI was handled in the middle tier.  

    I can provide more details ofcourse but I had to post this and get some reaction.   Is there any truth to this general assertion????    

    TIA

    Joe

  • I have seen this before, what they are realy saying is that it would be more difficult to recode there work. It would not supprise me if there are orphaned rows in the base.

  • If I remember correctly, SQL Server is able to use the PK/FK relationships when it computes query plans to produce plans that are a little more effective.

    I can see that there might be some performance loss as the server is doing a check constraint to ensure that the data is valid, but I can't imagine what scenario they've got where the performance hit would be measurable.

    Besides, what is to stop someone with direct access to the database from "fixing" some values and accidentally breaking their middle tier RI?

  • Ask anyone who has worked on an app, like older versions of Siebel CRM for example, where there is no RI on 2000+ tables and ask them how well the business components maintain RI.

    As to whether RI degrades performance, the answer is usually "it depends" - if the database design itself is suspect, or if the indexing strategy is not up to par, then of course there will be performance issues. Consider inserting/updating a table with 10 foreign keys, where the 10 referenced tables containing the Primary Keys aren't indexed correctly, or don't have a clustered index and are horribly fragmented heaps. Performance will suck, but you can't blame the RI, you can hire a better DBA  

  • Hey, I agree with the developers that it degrades performance.  I hate DRI.  I have seen it bog things down unbelievably.

    However, in my applications all data modifications happen in stored procs where I do my own RI checks as required.  I,U,D permissions are denied to any user of the application.

    So, if you are going to complain about developers being wrong for not using DRI, I would say you are wrong to allow any insert/update/delete from the client in anything beyond the simplest of applications.

     

  • thanks to everyone for the quick replies.

    To further the discussion, I'm really concerned that one of SQL server's important features was totally bypassed because there was a line of thinking that DRI would be a significant performance hit.  I've done a fair amount of perf tuning and DRI has never been an issue.  The main issues I've encountered were: blocking, poor index strategy, poorly written queries, lastly hardware that was not up to the job.

    After i heard that DRI was not used, I immediately thought: how could RI checks be faster using higher level code?  i.e. stored procs.    SQL Server's DRI is low level and highly optimized by a company the literally spends a billion dollars or more developing and improving their product.  

    Would really like to hear further comments on this issue.   Thanks!

  • Sorry - I just can't go along with that, John.  In anything close to a complex application with potentially hundreds of stored procedures you can't tell me that you've got every conceivable RI condition covered thoroughly by the code alone.  Furthermore, what do you do if you ever need to do manual data fixes (and any application will need this at some time - that's life)?  Any mods to the data in a table with several potential child tables is just asking for trouble if the foreign key constraints aren't in place to hold your hand for you.

    So long as you've indexed all your FK columns correctly, then I don't believe that your average application will notice any significant performance degradation.  Of course, if you need to squeeze millisecond-level performance out of your database for some reason, then you may have a very specialised app where you could justify removing constraints.  However, I believe that's the exception, not the rule.

  • That is a frightening statement to see from someone that has posted almost 300 messages to these forums. God help any newbies that heed your advice!

  • This should be fun.  I'll get the popcorn.

     

  • Please, let us know in which way DRI degrades performance? Simply stating that "it bog things down unbelievably" is ridiculous.

    CREATE TABLE dbo.foo (a INT NOT NULL, b INT NOT NULL)

    CREATE TABLE dbo.bar (c INT NOT NULL, a INT NOT NULL)

    Maybe you can now explain to me in which way performance is degraded after I add a primary key to foo.a and a foreign key to bar.a referencing foo.a. In which situations is performance degraded? Read, write, both? Why?

  • Here are some good questions:

    1. What is the cost to have orphaned/bad data in the database?  (Of course it will never happen)?

    2. How do you know what is impacted if you drop a table? We don't need this table any more so let's drop it.  Ooops why does the monthly report no longer work?

    3. Have you run/do you run regular checks looking for violation of RI?  Try it and see what you get.

    I will state that I do know that StarTeam by Borland/StarBase is a professional product that does not use RI at the database level.  I believe it is done so that they can be more database independant as it works with multiple databases.  In fact it used to work on any ODBC database, but they have changed that.  They also have their table and columns names encoded so they aren't easy to "hack" so they spend extra time and money to ensure the RI works.

    So yes you can do it.  You should have a good reason for doing it, and performance is a questionable one.  If there is lots of deleting going happening on a parent table with lots of children or grand children, then yes I could see issues.  In general as a DBA I am paranoid.  I make mistakes, programmers make mistakes, so I hope you have a good testing team that checks for RI problems.

  • I've worked on about a dozen systems, some that I helped design and some from third parties. There hasn't been _any_ database enforced relational integrity in any of them.

    There are a lot of exceptions that would lead to disaster if the database enforced cascading deletes. In many cases, records have to be moved to history/audit tables before being deleted. In some cases, you merely need to mark the child records with STOPDATE, sometimes you have to reparent them, or sometimes orphans are ok. Think of the business rules first!

    Are the business rules such that you can allow the database to delete rows based on simplistic relational integrity rules? Sometimes yes and sometimes no.

    If the business rules are at all complicated and you wanted to use built-in relational integrity enforcement, it would take substantial engineering effort to design appropriate triggers and then test them in all possible cases.

    Usually the business rules are complicated and that is why we design n-tier systems so that we can write complicated logic for relational integrity in a high level and flexible procedural language in a middle tier rather than be shoe-horned into simplistic cascading deletes based on foreign key constraints.

  • I once showed up for a job interview where the development team showed me the scariest database diagram I had ever seen.  Suppressing the urge to run away, I asked why there were no DRI relationships.  Their answer was that foreign keys just get in the way during development, they would add them at the last minute when it went into production.  In the most amazing example of instant karma I have ever witnessed, they were all fired the next day and I was the first member of a new development team that started from scratch.  The only evidence of their work that remained was that horrible diagram that we brought out occasionally when we needed a laugh.

    True story.

  • Having posted previously that I haven't seen FK relationships enforced in the database, I want to say that all the systems that I've worked on did have _logical_ FK relationships defined in some way.

  • That's amusing (though the development method sounds eerily familiar in some aspects). Reminds me of a story from http://www.thedailywtf.com where a guy went in to discuss some possible work. He discussed the current structure with the team and found out that the developer had created a PK based on customer address. When he dug a little deeper as to how that worked, he found out that the developer had been manually fixing addresses to make them unique (by prepending data). The boss commented briefly on the increasing volume of returned mail. From what I recall, that developer was not there for the next round of chats.

Viewing 15 posts - 1 through 15 (of 40 total)

You must be logged in to reply to this topic. Login to reply