general indexing question

  • Let's assume you have a table with columns that have a very low selectivity by themselves (like columnA is a date, and you add several thousand records/ day, the others are numbers that also have a relatively low selectivity. The only unique combination is columnA/columnB/columnC (and maybe, D). Your queries are using columnA and one or 2 of columns B, C and D (but not together). How do you pick your clustered index? Do you pick a composite clustered index over columnA/B/C just to have a clustered index and create nonclustered indexes for the other one or two columns? If you go this way and delete records for the oldest date and add new records daily, the other non-clustered indexes are automatically (very) fragmented. Or, do you just create non-clustered indexes for all (3) columns you need and leave it as a heap?

    No other options like PK, partitioning, indexed views are possible.

  • Personally, I'd look at how your restrictions are actually working. The reason for this is because date in this case as the leading edge is probably not your best choice unless you pick a particular day when you're looking for records.

    Once you hit a 'range' during an index seek, it stops moving to deeper levels of the index.

    So, I'd evaluate what the primary access method(s) are, figure out call rates, and work towards that being my clustered index, and then create covering indexes for the other call types.

    However, that's VERY generic. There are probably a hundred approaches to this for optimization and each one lies in the details, not the generic overview, of what's happening at the data selection level.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Cluster by columnA (date/datetime) only.

    That will keep the clustering key short but still restrict the number of rows a query must read and evaluate.

    Sometimes additional column(s) are appropriate for the clus index, but you need someone with expertise to review the table and index usage, missing index stats, etc., to determine that.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • That's what I did, implement datetime as the clustered index and deal with fragmentation maybe on a daily (nightly) basis. Goes along with what Craig said as well, since I have queries on that datetime, it is worth making a clustered index out of it.

    To generalize, don't create a clustered index just for uniqueness (high selectivity), look at your queries first...

    Thank you, both!

  • Typically non-clustered indexes are not needed here, since you specify columnA in (virtually) all queries, in a reasonably narrow range.

    Of course sometimes they do help specific queries, so you could try them and then look at index_usage_stats to see if they are, in fact, being used.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply