More Efficient Storage Method of a LARGE cross-reference table

  • Problem:  (Basically a “Current News” feature)

    I need an efficient way to store and retrieve data from a Notes Table for each user that logs onto our system.

    Notes are linked to “Customer” records.

    When a User logs on, they will see all "Pending" Notes that are active.  They will be able to clear the notes after they view them.

    Users will be able add to these notes, and control which other users will see them as “Pending”.

    In a simple world, I would create a cross reference table with the User ID and Note ID for each “Pending” note.

    My problem, is that I have over a million (1,000,000) Notes, and almost two thousand (2,000) users.  I am estimating that my table will grow to over 25,000,000 records.

    I was toying with changing the Cross Reference table to have the Note ID, and a text field with all of the user ID’s delimited with tilde’s.  I would then have to select from the table all records with CHARINDEX(field, ‘~’ + UserID + ‘~’) > 0.

    I think that this select will take too long.  I was also thinking of going the other route, with the User ID and the delimited Note ID’s, but we will need the ability for Supervisors to clear the notes from all other Supervisors.  At this point, I would have to find all Users that are Supervisors, and then grab each cross-reference record to remove them.

    I am looking for a more efficient method than Brute Force to store the records.  I will also need to have the storage method coded in a fairly clear and straight forward manner.

    Any thoughts on this would be happily received!

    Thank you,

    Bryan Clauss

  • CHARINDEX(field, ‘~’ + UserID + ‘~’) > 0

    That technique is sssssssssslllllllllllooooow. you can't use indexes with it. I works on small sets but you are going to get table scans. Also the length of it will constraint the design unnecessarily

    I will stick to the Normalized Association table (UserID,NoteID) and create two indexes so that you could search both ways and probably  a third unique clustered on both columns that will help with range. That table will be very long but very narrow and in the end it will be very fast.

    Just my $0.02


    * Noel

  • I agree that your associated (User/Note) table is the way to go.

    If each note is being assigned to a large number of users, perhaps a group system could be implemented. Either by creating the new concept of "groups" with manual assigments to groups, or based on other information available in the user table (location, business unit, etc) In this way you could assoicate a note to a large number of users with only one row in your assignment table. It might also be allot more usefull for your users.

    Dave

    Trainmark.com IT Training B2B Marketplace
    (Jobs for IT Instructors)

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply