Clustered INdex Strategy

  • SQL Server 2K

    I trying to form a basic strategy for creating Clustered indexes so I was hoping to get some input. I've been reading through various resources on the net but I haven't come across anything that answers these specific questions.

    I have a table called ORGS which contains a list of all "organizations" in a database. I have a similar PEOPLE table which contains names of people and other person specific information.

    The ORGS table contains a surrogate primary key (int) and a name field, 50 characters wide and a few other identifying field that are not accessed too often. 99.9% of the time I use a query where I need the name of an organisation I am JOINing on the PK field and including the NAME and PK field in the result set.

    While this table does change, I would say no more than 5 records per day (of say 1000) are updated (I/U/D). So I was thinking that if I had a CLUSTERED index on the PK + NAME fields, SQL Server would get a lot out of that index.

    So now for the question:

    1) Is the above a true statement and is it a good strategy for tables like that?

    2) Since I obviously already have a UNIQUE NON-CLUSTERED index on my PK field, should I include that field in the CLUSTERED index or would that be redundant

    3) What about the PERSON table a mentioned, similar circumstances except I have multiple fields I access FirstName,LastName,MiddileName,PRefix,Suffix - would I include those in the clustered index or is that just basically duplicating the table in the index file

    Thanks for any and all input, as you can see I'm still trying to get a handle on indexes.

    Also if anyone can recommend good resource for indexing tips etc I would appreciate a pinter to them


    Much Thanks,

    Steve Dingle

  • My basic strategy is to avoid using a Clustered Index as a PK if the table suffers a large number of Inserts, Deletes, or Updates to the PK columns because the data get's "reordered".  Same goes for using a Clustered Index anywhere in the table... if the table is highly transactional or the columns in the index are going to be updated a lot, I just won't even use a Clustered Index on highly transactional tables.

    Even on SELECTs, I won't just assign a Clustered Index in a willy nilly fashion... I'll "collect" several of the more difficult queries in a single Query Analyzer window and let the Index Tuning Wizard suggest what the Clustered Index should be... if it doesn't recommend one, I won't create one.  Has worked very well, so far, on tables and joined queries that access millions of records.

    But that's just my humble opinion

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hey Jeff, I agree with what you are saying about not always creating clustered indexes on highly transactional tables, but I'd like your opinion on how you keep the fragmentation to a minimum for those highly transactional tables without a clustered key?  I've heard of people occasionally adding/dropping a clustered key or dumping and loading the table, but what approach to you take?

     

    Thanks in advance.....

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • There's no single approach that fits all situations.  I've found that there are lots of different factors affecting such a decision... on 24/7 tables, I might use DBCC IndexDefrag because of it's "online" nature (backup the logs frequently, though).  If it's a 24/7 staging table, I normally keep those small so DBCC DBReIndex may work well.   If it's highly transactional in "spurts" as in a batch, I may just clear the table, drop it, and start all over.  If it's a large permanent table for a batch, I may drop some of the indexes, do the inserts/updates, and rebuild the indexes.  Or, I may do nothing but let the DBA's know what the table and indexes are and they seem to deal with it quite well between their maintenance plans and a couple of key routines that do "surgical" DBCC DBReIndex's at both the table and index level.  And, sometimes, you just have to be careful what you index... or not... I've so many times seen where people just go nuts with putting too many indexes on HTT's, or worse, combining the functionality of an HTT with that of a process table which may require a whole different set of indexes to be effective.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply