Database Design and Reference Tables

  • As soon as I read this article, something inside me said "Lawd, I hope Joe don't see this!"

    Seriously though, Joe's right.  This table design offers zero benefits to outweigh the complexity of query design, the lack of referential integrity, the fact that every rule of normalization is broken and the fact that every single column must be included in the Primary Key - which can't in fact be done since at least one of the columns is nullable!

    The only "benefit" to this design is that, if you're a really lazy DBA or developer, you only have to design and maintain a single table.  Of course, for all the work you avoid up front you'll end up paying in the end with more complex/slower queries and more complicated admin and application design.

    The only reason to store data is so you can retrieve it in a useful form later.  As a general rule of thumb, I find that if a database itself gets in the way of data retrieval it probably needs to be scrapped.

    Some of these articles should probably come with disclaimers: "The opinions and views expressed in this article are not necessarily those of the SQL Server Central staff.  Or anyone else for that matter."

    Bottom line:  this thing is horrendous.

  • Having spent time reading all the posts I've now forgotten what the last column was in the table design.  Needless to say this column represents a flaw in the authors own design in that it contains repeated duplicate data.  Had the author put a little more thought into this, then this column itself would refer back to a row in the MUCK for its value.

    I congratulate the author however for posting and enduring the ensuing feedback and say better luck for future articles if you dare.

  • As soon as I read the article I thought "Thankfully we had the anti-MUCK table article not long ago". I'd like to add my vote to those against MUCK tables particularly after my experience looking at the database of our helpdesk software which is a well known product and uses very mucky tables for lookups. We have lists and lists of lists and lists of sub-lists and lists of sql statements to lookup the lists. And the link tables are equally horrendous. Oh for a good clean referential integrity, normalised design.

    There is a hope though as there are moves afoot to replace this product due to its inadequacies.

     

  • I found a few flaws in the generic look up table design

     

    1. You are retrieving the data from look up table based on a string rather than int, it slow downs the peformance.

      SELECT intID,strCode,strDescription FROM dbo.GenericLookUp (NOLOCK) WHERE strCategory = @strCategory

    2. Adding to it the table "Generic" look up table is not adhering to the normalization rules






    Regards,
    Sudheer 

    My Blog

  • I'm fairly sure I met this class of design in the early nineties where it was introduced to me as a CLUTs (Combined Look Up Tables) table.

    As it is clearly despised, and with good reason, in relational database design; could someone please tell us what is best practice when faced with a requirement like the following:

    'Each customer data entity (table) shall contain 2 customer configurable coded attributes.  As valid values for each of these attributes, the customer may either define their own code set or attach an already existing code set.'

  • I doubt that I can add anything meaningful or cite any better references for why a single lookup table isn't a good idea.  I inherited a database that has a single lookup table and we have been working diligently to eliminate it by incorporating separate lookup tables and eliminating those rows from the lookup table.  We've had no end of data integrity issues not to mention that the table is used for a dumping ground for any name/value pair used anywhere in the application. Heed the advice of the many people posting here--avoid single lookup tables.

  • That depends entirely on what you specifically mean by "customer configurable coded attributes".  Are you talking about a couple of columns for which the customer defines the valid values?  If so, you could just add 2 tables with allowable codes in them and set up referential integrity rules to your 2 columns.  The customer would define allowable values by updating the tables containing allowable values.  "Attaching an already existing code set" could be done by copying existing values from other tables to your new tables.

    Of course it's hard to give an precise response without precise details...

     

  • Customer configurable attributes can be a huge problem.  The fact that commercial software so often can't properly define the requirements for all of their potential customers is one of the major factors behind the terrible database models that are so often used.  Not coincidentally, it is also a major factor behind the difficulty involved in integrating systems.

    The example you gave should still be handled by two seperate reference tables.  Your Customer table would need two varchar columns that would each reference another table (call them CustCode1 and CustCode2) that could consist of a Code and a maybe a Description column.  This would allow the customer to define their own codes and meanings but still constrain the values to be only those defined for the particular column.

     

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • We enforce RI using the  check constraint with UDF's. This UDF is common, which takes two values, the key type (example CARD_TYPE)  and value (VISA, MASTER etc), queries the lookup table returns a 1 or 0 if valid.

    #1 - You cannot possibly get better performance by manually performing RI functions in T-SQL than you can using the built-in SQL Server RI functionality.

    #2 - Everytime you add or remove an entry from the "generic lookup table" you have to also modify your code; whereas with a normalized database design I can simply add or remove an entry from the table, job done.

    #3 - Also if you remove an entry from the "generic lookup table" you have to manually check every table that "references" the "generic lookup table" to ensure that you don't leave any orphans behind.  I.e., when you decide to stop accepting the Diner's Club card (is that even still around?) you have to manually check all referencing tables to ensure there are no references to the Diner's Club card.

    #4 - The performance on your queries is probably nowhere near where it should be.  I'd love to compare query plans with you using your method versus a properly normalized design.

  • Thank you Mike

    Essentially I am talking about a couple of columns for which the customer defines valid values.

    Certainly defining a separate table to hold the valid values for each such attribute would work, and conform to relational integrity rules as far as each such table went.

    Sadly, as soon as a customer wants to reuse an existing code set; we are left with multiple tables containing the 'same' data.  Not only will the contents of these tables drift apart over time, but the drift will be different at each customer site.  A major maintenance headache in its own right.

    Is there a well principled way to allow the customer to link the attribute to an existing table of while only choosing that table after installation?

  • Not unless you are willing to allow them to alter the table (by adding a FK constraint).  This can get pretty messy when it comes time to roll out upgrades etc... also the datatypes might not be the same either.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • One solution might be to create views which could be altered later if the user wanted to change attribute sets, or you could dynamically generate SQL queries.  You can ensure that all columns have the same names for your client-side app by aliasing them.

    Of course the altering would require the user to have what I would consider "extraordinary rights" in the database...

  • That would be one other major flaw I'd want to avoid .

  • I don't see how defining views can possibly help this situation.  Views can't be referenced by FK constraints.  Using such a method throws you immediately back to triggers or UDF's to enforce integrity. 

    And like I said, allowing customers to modify the database leads to major support issues down the road. 

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Not even with Indexed Views which speaks volumes about their "usefulness".

    Like I said, dynamic SQL and modifying the tables themselves are a way to do this and maintain proper DRI.

    And like I said, that means giving the user "exceptional" rights to the database.

Viewing 15 posts - 46 through 60 (of 101 total)

You must be logged in to reply to this topic. Login to reply