Foreign key constraints...WHY??

  • Steve Jones - Editor (4/3/2010)


    I'd argue that a large CRM/ERP application without FKs is just poorly designed. There are all sorts of support/administrative contracts in place to prevent you from working with the data because they have poorly designed things and are depending on the application to handle all integrity.

    A poor method, IMHO.

    .

    Too true. The MS Sharepoint agreement is so restrictive that just issuing a QUERY against the Sharepoint database violates the support agreement.

    That's right, you read that correctly.

    Issuing a QUERY in a DATABASE is considered so DANGEROUS that it might BREAK Sharepoint.

    Jeesh.

    Is that ironic, or what? "Gosh, Betty, it's not safe to query our database, no telling what might happen. Spontaneous combustion or somewhat..."

    As Bugs Bunny would say about the folks who built a system that badly, "What a bunch of ultra-maroons."

  • vidya_pande (4/3/2010)


    I have seen huge CRM application (Product) is with out FK's. These databases are in hundreds of GB's.

    One advandate is definitely there - hiding the DB design.

    Also i have seen large data warehouse database without FK's. Huge data loading is done in to these databases on daily basis.

    May be giving better performance.

    :w00t:

    Good points. I would say that a Warehouse implementation is not the same consideration. There are several different considerations for a warehouse.

    I have worked with Tb databases without FKs and have worked with TB databases that have FKs. With FKs, is easier to work with.

    Though, I certainly concede the point on CRM. I'm not saying it is a good design (i think Microsofts CRM needs plenty of work).

    I also see the point on trying to hide the design. However, hiding the design doesn't really help anybody - it only provides a nuisance to anybody that truly wants to know the design.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Paul White NZ (4/3/2010)


    CirquedeSQLeil (4/3/2010)


    If you query a large table joined to another table that does not have proper referential integrity in the database, data accuracy is thrown out the window. Furthermore, performance will be worse. Proper database design requires referential integrity and proper indexing.

    I am intrigued why "performance will be worse" without PK-FK relationships.

    It is certainly true that the optimizer can occasionally make use of a checked and enabled relationship to add extra implied predicates, or (very, very, rarely) to eliminate a join altogether...but it is also true that, in general, PK-FK relationships are a net cost overall - and potentially a very significant one for OLTP workloads. In fact many cases I see of extra 'implied predicates' add nothing of value and simply 'state the obvious'.

    I have not seen a FK cause database performance issues such as you are indicating. However, I can show database performance that flat out sucks due to the lack of it.

    Again, this is not my experience, as indicated above. Proper design, appropriate indexing, keeping transactions short, (and so on) are much more important for overall performance. I still cannot conceive of a database design that, while otherwise sound, would produce sucky performance simply because a PK-FK relationship did not exist.

    PK-FK relationship checking can also be the cause of hard-to-track down deadlocks. Not too many people realize that a SELECT on a child table can hold an X lock on the parent in some circumstances, for example.

    There is a good reason that the application layer and data layer are separate layers - this is one of them.

    It simply isn't good enough to state that entity integrity must be enforced by the database layer, and anything else is somehow irresponsible or dangerous. The reality of things is such that many fine systems are built with integrity enforcement built into the mid-tier. You may not like it, I may not like it, but there we are, nevertheless.

    I don't mean to rant, but the situation is not at all as clear-cut as database people frequently imagine it to be.

    Relational integrity isn't something you're supposed to be able to "choose". It's like security: if you build it as an "opt-in" facility, and leave all sorts of ways to bypass, there IS no security. The data layer is there to maintain such things, and - it's one of the 10 laws of a RDBMS (that's right - that little document from Codd).

    In my view - the Interface layer is absolutely responsible for helping to maintain the DRI, but having a system you can simply bypass and wreck the validatity of your data, well - that invalidates it as an option for ever being the "chief cop in charge of relation integrity".

    And fankly - "performance" doesn't play into such things. I will fight the battle to fix the perf issue, but - giving up on DRI is just not an option.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Steve Jones - Editor (4/3/2010)


    I'd argue that a large CRM/ERP application without FKs is just poorly designed.

    It's certainly arguable 😉

    On the other hand, there is an element of "well, we would say that wouldn't we?" going on here.

    For every database person that argues that constraints belong in the database, there is a developer that argues that it belongs in the mid-tier, and an architect that doesn't really care what the other two say, and will go his or her own sweet way, thanks very much 😉

    One part of the argument revolves around where integrity enforcement conceptually belongs. This debate between database people, developers, and architects tends to get quite tedious quite quickly, at least in my experience.

    Developers typically argue that relationships and business rules should be enforced together, in one place, using common code. Architects typically want to use whatever cool new technology has just been invented to solve all problems everywhere, at all times, possibly using magic. Database people bang on about Codd and can be upset by people suggesting that databases are just for storage. All three are convinced that the power of Right is on their side.

    In the real world, we have to compromise, and there is a whole spectrum of valid designs, each with its own strengths and weaknesses.

  • Paul White NZ (4/4/2010)


    One part of the argument revolves around where integrity enforcement conceptually belongs.

    ...

    In the real world, we have to compromise, and there is a whole spectrum of valid designs, each with its own strengths and weaknesses.

    Quite true!

    I start with the premise that the data is supposed to be valid. That's, on occasion, NOT a business requirement at all. (Sometimes the business requirement is to BE SEEN to have an application on a topic and pretending it works correctly is just fine.)

    Assuming the data is supposed to be valid, where can data be entered into the system incorrectly?

    The front end, the middle tier(s), and the database tier are all possible locations for the introduction of invalid data.

    If the validation is in the front end, bad data can be entered via the other tiers.

    If the validation is in the middle tier(s), bad data can entered via the database tier and any middle tier closer to the database than the validation tier.

    If the validation is in the database in stored procedures, bad data can be entered via the database tiers directly into the tables.

    Only validation code directly part to the table definition can't be circumvented by accident. (Trigger code isn't a direct part of the table definition and (sadly) can be circumvented because some bulk load routines don't fire the triggers.

    If you don't trust that your users or programmers (current or future) will always do it right, and you value the validity of the data, you put the validation as close to the table as you can get it.

    The more risk you are willing to take on that score, the further you can move the validation away from the table.

    I don't trust users or programmers (and that includes me! 🙂 ) to do it right, so I prefer database enforcement of validation.

  • david_wendelken (4/4/2010)


    I start with the premise that the data is supposed to be valid. That's, on occasion, NOT a business requirement at all. (Sometimes the business requirement is to BE SEEN to have an application on a topic and pretending it works correctly is just fine.

    :w00t: please tell us you do not work on a nuclear facility 😀

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (4/4/2010)


    david_wendelken (4/4/2010)


    I start with the premise that the data is supposed to be valid. That's, on occasion, NOT a business requirement at all. (Sometimes the business requirement is to BE SEEN to have an application on a topic and pretending it works correctly is just fine.

    :w00t: please tell us you do not work on a nuclear facility 😀

    And if I do, which tier would you prefer that data validation takes place in? 😀

  • david_wendelken (4/4/2010)


    PaulB-TheOneAndOnly (4/4/2010)


    david_wendelken (4/4/2010)


    I start with the premise that the data is supposed to be valid. That's, on occasion, NOT a business requirement at all. (Sometimes the business requirement is to BE SEEN to have an application on a topic and pretending it works correctly is just fine.

    :w00t: please tell us you do not work on a nuclear facility 😀

    And if I do, which tier would you prefer that data validation takes place in? 😀

    My guess is that he would prefer it to take place inside an Oracle database :laugh:

  • PaulB-TheOneAndOnly (4/4/2010)


    david_wendelken (4/4/2010)


    I start with the premise that the data is supposed to be valid. That's, on occasion, NOT a business requirement at all. (Sometimes the business requirement is to BE SEEN to have an application on a topic and pretending it works correctly is just fine.

    :w00t: please tell us you do not work on a nuclear facility 😀

    Heh... taking David's paragraph out of context certainly makes that kwip seem valid but if you look at the rest of his post, you'll understand what he means when he says that he "start[ s ] with the premise that the data is supposed to be valid". He's not suggesting that any given "source" of data is presumed to be valid... he's saying that anything that finally makes its way into a table is supposed to be valid and, in fact, correctly identifies many areas were data can become incorrect, inaccurate, or just plain not valid.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • You can't trust the database to enforce foreign keys anyway.

    https://connect.microsoft.com/SQLServer/feedback/details/357419/merge-statement-bypasses-referential-integrity#details

    😛 😀 😛

  • Actually I was more concerned about "Sometimes the business requirement is to BE SEEN to have an application on a topic and pretending it works correctly "... scary :crying:

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (4/4/2010)


    Actually I was more concerned about "Sometimes the business requirement is to BE SEEN to have an application on a topic and pretending it works correctly "... scary :crying:

    True story.

    A good friend of mine, who will remain nameless to protect her from retaliations by the guilty parties, passed on this story of what happened to her while working for a competing six-letter database company starting with the letter "O"...

    Her boss came to her in mid-October and told her he needed an application done by the end of the year. She spent a few days researching it and reported back that she could finish it by March.

    "That's not good enough, it has to be done by the end of the year."

    "Ok, can I have more people?"

    "No."

    "Ok, can I reduce the scope of the project?"

    "No."

    "Then it will be done in March."

    "That's not good enough, it has to be done by the end of the year."

    To save me typing it out again, 3 more times, re-read the above conversation 3 more times.

    At this point, while her fingers were itching to throttle the idiot she was talking to, she remembered a conversation we had about communication skills. So, she changed her approach.

    "Maybe I've misunderstood what you want done. Let's go over my project plan."

    They went over it and he agreed that each task she had listed was necessary, that each task's estimate was reasonable, and that the dependencies between the tasks were correct.

    "Ok. I can be done by March."

    "That's not good enough. It needs to be done by the end of the year."

    "Ok. Can I have more people?"

    "No."

    "Ok. Can I reduce the scope?"

    "No."

    "Then I can be done by March."

    "That's not good enough. It needs to be done by the end of the year."

    Those fingers were really, really wanting to choke the daylights out of this moron. Then she remembered the conversation about communication skills again, and started down a different path, because the fellow wasn't NORMALLY this stupid...

    She started asking questions to try to figure out what was behind this requirement to have it done by the end of the year.

    Turns out the company worked on a "management by objective" policy and his whopping big year end bonus was contingent upon getting this application installed by the end of the year. Since, by mid-October, he had already spent the bonus money, he was really motivated to get it done by the end of the year.

    At this point, my friend, said, "Ok. How about I shift the project schedule around a bit so I can be done by the end of the year, and then I'll spend through March working on bug fixes and enhancements?"

    "That would be great!"

    He didn't care if it worked on December 31st, he only cared that it could be announced to work on that date.

  • david_wendelken (4/4/2010)


    He didn't care if it worked on December 31st, he only cared that it could be announced to work on that date.

    Nice story. I would have laughed harder were it not so very true, and quite common.

  • david_wendelken (4/4/2010)


    He didn't care if it worked on December 31st, he only cared that it could be announced to work on that date.

    BWAA-HAA!!! Great true story. Unlike Paul, I laughed doubly hard because it IS so very true and so common. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • david_wendelken (4/4/2010)


    At this point, my friend, said, "Ok. How about I shift the project schedule around a bit so I can be done by the end of the year, and then I'll spend through March working on bug fixes and enhancements?"

    "That would be great!"

    He didn't care if it worked on December 31st, he only cared that it could be announced to work on that date.

    That's fine project timeframe management not "pretending to have an app that works fine" per-se. 😉

    Not an unseen occurrence at all but a very common practice.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

Viewing 15 posts - 31 through 45 (of 70 total)

You must be logged in to reply to this topic. Login to reply