Beginning Database Design - Spot the Flaws

  • I've read everything: interesting article and comments!

    FWIW, here's my take on a first-cut design *for just books* [personally I would adjust it to allow reviews on other objects -- such as a web/magazine articles -- but that would break the domain of the q so I avoided doing that].

    Four (core) tables are needed (at least), as follows:

    Book [must be separate since info about book only is stored];

    Author [must be separate since info about author only is stored];

    BookAuthor [intersection table req'd because of m-n relationship betwn Book and Author];

    Review.

    Full details:

    Book (

    id INT IDENTITY(1, 1) NOT NULL,

    title VARCHAR(100) NOT NULL,

    datePublished DATETIME NULL, --some people ignore old(er) books, esp. in technology field

    owned TINYINT NULL,

    status TINYINT NULL,

    dateAcquired DATETIME NULL, --needed if you want to see how many books bought in a yr

    isbn13 CHAR(13) NULL, --nullable becz do you really check isbns when you look at books??

    imageURL varchar(200) NULL,

    isbn AS RIGHT(isbn13, 10)

    ) --Unique,nonclus = id; Clus indx = title

    Author (

    id INT IDENTITY(1, 1) NOT NULL,

    prefix NVARCHAR(10) NULL,

    firstName NVARCHAR(40) NOT NULL,

    midName NVARCHAR(40) NULL,

    lastName NVARCHAR(60) NOT NULL,

    suffix NVARCHAR(10) NULL,

    name AS ISNULL(prefix + N' ', N'') + firstName + N' ' + ISNULL(midName + N' ', N'') +

    lastName + ISNULL(N' ' + suffix, N'')

    ) --Unique,clus = id; Nonclus indx = (lastName, firstName, midName)

    BookAuthor (

    bookId INT NOT NULL,

    authorId INT NOT NULL,

    sequence SMALLINT DEFAULT 1 NOT NULL

    ) --Unique, clus = (bookId, authorId, sequence); FK: bookId REF Book(id), authorId REF Author(id)

    Review (

    bookId INT NOT NULL,

    date DATETIME NOT NULL,

    rating DECIMAL(2, 1) NOT NULL, /*to allow half stars also*/

    review TEXT NOT NULL --or VARCHAR(MAX) if available

    ) --Clus indx = date; FK: bookId REF Book(id)

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Wow! What a hot subject! I have enjoyed the discussion of using an identity column as the primary key versus some "real" property of the data. I first learned building tables and databases using identity columns for the primary key, but in my current work, they are never used, so I was wondering which is preferred. Some excellent arguements in favor of the identity column.

    By the way, I like Russ's idea to make the Owner column a varchar datatype, and store the name of the person you borrowed the book from, even though this was not the original intent for this column. Putting myself into Steve's shoes, if I were borrowing a lot of the books I want to review (as opposed to buying all of them), this would be a great way to keep track of who to return them to!

  • I like Identity columns. I just don't like them enough to use them without merit.

    I believe they are at their best when strong relations are being used, and when the application is complex. This is especially true when new data is constantly being added. They are good PKs and excellent FKs.

    I also like them because their naming conventions can be kept consistent. Database application development goes a lot faster when you can easily remember the foreign key names. Coding can be lightning fast for creating the maintenance procedures, and for creating recordset procedures. The same thing applies to coding the UI for the application.

    ... and don't forget the beauty of cascading deletes.

    Tom Garth

    Tom Garth
    Vertical Solutions[/url]

    "There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers
  • If you are going to add the owner of the book (so you know who to return it to) you should consider adding a start (bought /borrowed) date so you know know how long you've had the book.

    Things must be quiet on the DK Ranch with time to read all these books 🙂

  • Here's another idea that (like others) adds to the scope, but seems in keeping with the spirit of the discussion. The initial requirements list rarely covers the needs of the project. We must speculate/ask questions about what is needed to really make the user happy.

    To start: As others have suggested, a separate Authors table seems like a better idea than free-typing author names in a varchar field. What if you want a list of books by an author and you had miss-typed an author's name in a free-form field? What about the user-friendly approach of providing a drop down box so that author names don't have to be fully typed out? Sure, the user SAID, "I know that many books have multiple authors, but this wasn't designed to be a full scale, searchable system." But then the sneaky guy turned around and said in a later comment: "There will be queries by author (list by author),.." Ahh. That's telling. It's our job to probe to get to what the user really needs.

    Those are arguments/reason for having a separate Authors table. Here's the new idea to scratch that itch that the original user didn't think to ask for: Have some kind of column for rating the author. I have found that sometimes I may not like a particular book, but I thought that the author had a lot of potential. My own book-tracking app lets me mark whether or not I would like to read more books by that author. Why bother? Suppose I see in the future that I didn't like a book by that author (the book itself got a low rating). With the author rating at my disposal, I might still pick up another book by the same author instead of assuming that the author herself was not worth continuing with. I read so many books, I need these reminders. Having an author rating also gives me a quick way to get a list of my favorite authors. It's more direct than getting the list from the book ratings.

    In my database, I did something as simple as a bit/True-False field on the author. Do I want to read more by this author? But it might be better to rate on a scale, doing the tinyint thing. Any time a set number of options are expected, I never allow the user to freely enter the value without check contraints and usually a lookup table. In this case, I would make the scale a separate (lookup) table. One of the benefits of a look up table is that I can use multiple columns in the table to define what I mean by each rating. What does a 1 mean? What does a 5 mean? (Ex: 1 is couldn't get through the book the first time, never touch that author again!. 5 is best, undisputed author in the world. Etc.) Explaining what the ratings mean will keep me/Steve honest over time. This way, two years from now, a 5 is more likely to mean the same thing that a 5 means today--because the description would be there. Another benefit of a lookup table over a simple check constraint is that it is easier in the future to modify my scale (change rows in a table) in the future.

    Here's another thought: Unlike me, I would bet that Steve's reading tastes fall into more than one genre. Some books might be technical. Some mystery. Etc. A column and lookup table for type-of-book would be a great design idea. The user may not have thought of it up-front, but who wants a simple list of the 10 best books when it includes both books on SQL Server and the latest romance titles ;-). It's comparing apples to oranges. We better find out what kind of books Steve is reading. If he wants to track more than one genre, that calls for another column in the main Books table and another lookup table (with foreign key constraints defined).

  • JJ B (11/20/2007)


    Here's another thought: Unlike me, I would bet that Steve's reading tastes fall into more than one genre. Some books might be technical. Some mystery. Etc. A column and lookup table for type-of-book would be a great design idea. The user may not have thought of it up-front, but who wants a simple list of the 10 best books when it includes both books on SQL Server and the latest romance titles ;-). It's comparing apples to oranges. We better find out what kind of books Steve is reading. If he wants to track more than one genre, that calls for another column in the main Books table and another lookup table (with foreign key constraints defined).

    Following that line of thinking, it would add an extra table, but if you wanted to do genre I would add a tag table.

    Something like:

    Tags (

    bookId INT NOT NULL,

    tag varchar(15)

    )

    This would allow multiple tags per book, quick searches by tags, and the genre could be a tag as well.

    This would allow the author to look at all the SQL books in his library, or find that book about C# that he can't remember the name of.

  • Definitely was not expecting this much feed back. Apparenly Steve has a nack for "debates".

    Good one Steve!


    * Noel

  • As a SQL Server newbie I have to say that this conversation is great. This is a great thread and this article idea is great.

    Thanks!

  • 2 comments:

    - identity field makes a good primary key because it makes indexes and joins smaller but, if it is the primary key, there should be another unique key to preserve data integrity. If you just have an identity, you can enter the same name and it will always be unique so you still need a 'validation' key.

    - bit fields are bad. i've been bitten by them in the past and avoid them now. It's too easy to take something that should be True or False and find something else. As an example, I worked in a finance firm and we had a bit field on Living. Pretty simple, alive or dead. Right? Nope. We had to add a field for unknown if we couldn't contact the person any more. A whole new field. It's just not worth saving those 7 bits.

  • don_goodman (11/20/2007)


    ...

    7. Store your dates in varchar columns; 20 or 25 elements will work. It eliminates the conflicts between the different programming languages

    ...

    Worst advice ever! That's a real rookie mistake. Uses 4 times as much space, and allows you to enter invalid dates, among other problems.

    If a particular programming language cannot support SQL Server datetimes, you can convert it in the query.

  • "book" v "books"... hmmm.

    I've just been watching a heated thread in another group about naming conventions. In the past, I've tended to agree with you, that it's all about what's in a row, hence singular. However, there are others who are of the opinion that it's all about set theory, and that a table contains a set of entities, hence plural.

    SQL is ALL about set theory, so the latter opinion is starting to make sense to me.

    However, I've always liked to be able to use a singular name for entity table, and plurals for data tables and many to many tables. e.g. Book, Book_Reviews, Books_Authors. To me, that's more useful, as the name of the table supports its use, and improves quality by increasing the self-documenting nature of the database.

    Anyone else's take on this...?

  • Regarding table names, I now prefer singular but can live with plurals ... the main thing is consistency. Whichever you choose, use it thru-out, without regard to whether it's an "entity table", intersection table, etc..

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Andrew_Webster (11/20/2007)


    "book" v "books"... hmmm.

    Anyone else's take on this...?

    As if we don't have enough to discuss in this thread! 😀

    I go with Books, BookReviews and BookAuthors.

    This is a style issue and there is no single right answer, so long as you have a style and you stick to it. I prefer this style because I find it more natural, but I'm comfortable with most of the usual other styles.

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • I will agree with consistancy. Unfortunately, I have seen where one can be consistant, but still end up with some exceptions. I personally would try and stay away from using reserved words for one. Do you really want a table about tables called TABLE?

  • Hmm, but happens if/when your requirements change and a table that was a "prime" table becomes an intersection table, or vice versa? Do you go back thru and rename the table everywhere it appears??

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Viewing 15 posts - 46 through 60 (of 96 total)

You must be logged in to reply to this topic. Login to reply