Question: Will combining unique tables produce unique results?

  • Ok, Here is what may be termed a database theory question.

    If I have two or more tables with primary keys and I select all distinct elements of the primary keys and I join them appropriately will the result set be unique based on the distinct key elements?

    My gut is telling me that for inner joins this is true an example

    Table Order PK: CustomerID, OrderID

    Table OrderItem PK: CustomerID, OrderID, Line ID Note: Has a Foreign Key to product

    Table Product PK: ProductID

    My gut says that if I use inner joins to get a result set of CustomerID, OrderID, and ProductID, that these three columns could be used as a unique key or index in another table or an indexed view.

    Furthermore, my gut says that if I used outer joins, and had some nulls, that I would nolonger be guaranteed to be unique.

    I am wondering if anyone can confirm or deny my gut instincts.

    Thanks in advance, Brian

  • Hello,

    if I only take into account what you posted about the structure of tables, then the mentioned 3 columns can NOT be used as a unique key (or unique index). There is at least one missing condition : that the same product may not appear several times in the same order. This is not something you can take for granted, everything depends on how you work with the orders. In our company, it is nothing unusual that the same product appears on several lines of the same order.

    If you have a unique index/constraint on OrderItem (OrderID, ProductID), then you are right as far as I can imagine the various possibilities. Outer joins could cause problems in case some of your orders have lines with no product entered... but if there is a unique index/constraint as mentioned, then even outer joins should work OK.

    On a sidenote, I think that CustomerID in table OrderItem is superfluous and according to normal forms should not be there. As far as I know, one order is always from one customer and there is no reason to repeat the information again and again for every line. And more, since you already have it in the table Order, it is a duplication of info and you need to implement some error handling to assure that the same person/company is entered on both places. This said, I must admit, that the information system we are using (and can not influence, since we are merely users) does the same thing, i.e. stores the customer in both tables... maybe there is some special situation, where it could be useful..? Don't know..

    HTH, Vladan

  • Vladan,

    Thanks for the response.

    As far as missing information I agree with what you said, except that I do have the line number as part of the primary key for OrderItem in my example so an order may have the same product multiple times, but they are on different lines of the orders.

    Ooops my example was screwed up, you are right the three columns aren't unique, I forgot to add that line number, what I meant to say was Is CustomerID, OrderID, Line ID and ProductID unique.  Which it sounds like you agree it should be.

    In regards to your side note, it is most likely a philosophical argument over having customerID in the orderItem table, and here are some thoughts:

    1. If you want to have each customer have their own order numbers starting at 1, then the primary key of order must be customerID, orderID, and then since orderID is not unique across the system, you must carry the customerID to the orderItem table and I believe this would meet 3rd normal form.

    2. Presuming that you do have unique orderID across the system, then for performance you may want to have the customerID repeated and therefore you denormalize for performance.

    One other note while I didn't list a Customer table, the customerID would be the primary key of that table, and yes you would have to ensure that the correct customerID was entered in both the order and orderItem table, but that is normal processing, after all I have to get the correct order number in the orderItem table.  Details of the customer (name, company, address) would be kept in the customer table and would be maintained there.

    Thanks for the help, Brian

  • Hi Brian,

    You're welcome, I think we understood each other perfectly. If you use the line number, no problem (you even don't have to include product to have a unique combination) - meaning it should work fine even with outer joins. I take for granted that an order can not have two lines with the same line number.

    In your place, I probably would avoid naming the column OrderID, if it should start at 1 for every customer. I'd prefer to have OrderID that is unique (INT IDENTITY) and has no meaning per se, and then another column OrderRefNo that could start for every customer from 1, include year or literal characters (not numbers), start every year from 1 etc. etc. - as necessary. Among other things, this allows you to join OrderItem through only 1 column, OrderID.

    The sidenote was meant really only as BTW; i know you sometimes have to denormalize to some level to make the system do what is required. Whether it is necessary in this particular place or not, that's something I obviously can't tell from the few things I know about your DB... and anyway, it's up to you (or the DB designer) which way you want to go.

  • That's cool.  I was really using the classic order problem as an example.  That isn't what we do, I was using it to explore the general question will a unique set of columns when combined with another set of unique columns always produce a unique set.

    I think it will.  It would make a great Question of the day.

    Thanks again.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply