Unique Indexes Are Code; Non-Unique Indexes Are Data

  • Awesome article Daniel! Guys on the leading edge always get bashed by the sheeple detail workers. Personally I can't wait for the Singularity when computers become smart enough to not bother us with these "waste of a life" details on indexing.

    I virtually always implement dmv suggested indexes differently than recommended...because I know more about the monthly jobs and critical processes than sql server does. The DMVs are another piece of the puzzle to help us out. Thanks for trying to move the industry forward, maybe not real world but at least academically.

  • Thank you for this article. As a developer, I get slammed by management when the database I deployed runs slow in production. "Didn't you test it!!!!" I forget that testing the indexes in development is unlikely to replicate the issues encountered in production.

    Russel Loski, MCSE Business Intelligence, Data Platform

  • Awesome article; the suggestions should really be taken into account.

  • The usage of unique indexes instead of declarative primary key constraints is just an old way of implementing entity integrity and should be avoided/forbidden if it's not for backwards code compatibility.

    It has no real advantages using a unigue index instead of a primary key constraint based on a non nullable surrogate key. Instead you lose the possibility to implement foreign keys which has huge advantages and those claiming that foreign keys have a negative impact on performance are either non experienced or are maintaining a badly structured database. Referential integrity should be implemented in the database or a middle tier and never in the client code.

    Codd told us how to design a relational database (Each and every attribute should depend on the key, the whole key and nothing but the key, so help me Codd) and it's the only correct way to design an OLTP relational database. I did it in three systems 17 years ago and they are still working perfectly with almost no structural changes.

    The usage of surrogate keys is strongly recommended too instead of natural keys cause they are much more stable. Then you dont

    Then we have clustered index that can significally improve performance when used correctly based an an application analysis...

    About database constraints... Using a relational database without implementing referential integrity is like using a Rolls Royce for plowing or using the database as an incredibly expensive card register - an unlimited waste of money.

    Finally.... programmers should not design a relational database. They underestimate the cost of bad design and have a much too narrow view on the database - they only observe 5-10 tables at the same time instead of 100-200 in a normal as well as normalized database.

    The main reason of bad performance is bad database design... - 95-100% of all performance problems depends on the design...

    Don't use SQL to design a database - instead use a graphical tool and learn the DDL later...

  • I'm sorry, but I stopped reading when I read that non-unique indexes "... do not affect the database's behavior at all; they only affect its performance"

    If you believe that performance is not an important (frequently critical) part of a system's behavior, I would not want you working on any of my systems. I'm not worried about making systems as fast as possible, but they must be fast enough (for the users). Sometimes that comes down to things like non-unique indexes.

    Too bad, because I have the feeling the article had some good ideas to express, but I couldn't get past the aforementioned (IMNSHO) wrong-headedness.

  • aalcala (5/6/2016)


    I'm sorry, but I stopped reading when I read that non-unique indexes "... do not affect the database's behavior at all; they only affect its performance"

    That is very unfortunate. Whether you agree with the specifics of the article, I think that he makes a very interesting case that when it comes to non unique indexes developers cannot effectively create the indexes that will improve performance on production.

    Russel Loski, MCSE Business Intelligence, Data Platform

  • RichB (5/6/2016)


    What utter flamebait.

    Congratulations.

    As to people who are saying it's the DBA's job to index - Developers can't possibly be expected to grasp the subtleties... hogwash.

    No, it's not. Developers do not have the knowledge or skills to create or maintain indexes, period. Such as:

    What column(s), in what order, should key the clustering index (by far the most critical index on any table)?

    What should the fill factor be?

    Which filegroup should the index be stored in?

    Should it be a filtered index?

    Which index(es) should be combined?

    Should the index be compressed, and if so, row or page?

    Should the index be partitioned?

    etc..

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • orjan.franzen (5/6/2016)

    The usage of unique indexes instead of declarative primary key constraints is just an old way of implementing entity integrity and should be avoided/forbidden if it's not for backwards code compatibility.

    While in theory that's true, in practice it can make it more difficult and costly to make index changes. A unique clustered index can be rebuilt using "DROP_EXISTING = ON" with big performance gains. A unique constraint must be dropped and recreated, at potentially a vastly bigger I/O cost.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Thanks for sharing your thoughts Daniel, but I really can't agree with index creation / maintenance being outside the purview of development and in the care of admins only. I also cannot agree with each copy of the database having its own indexes. If you use something like SSDT, this will potentially cause schema altering deployments every day, and nothing will be in sync. Yes of course all databases have slightly varying copies of data, but most of the time this is not a big enough difference to warrant separate indexes. Developers don't create indexes in a vaccuum. They have a good idea of the data distribution on production, and create/modify indexes based on what is in production. Creating indexes for data in other environments that does not conform to a production-like spread just does not make any sense and accomplishes nothing. It is like saying "but it works on my computer". Who cares? It needs to work in production. And anything that is upstream from production should have data that mimics production as much as possible, otherwise what purpose does it serve?

    Hakim Ali
    www.sqlzen.com

  • Creating indexes for data in other environments that does not conform to a production-like spread just does not make any sense and accomplishes nothing.

    Yes, it does - it depends on the usage of the data. If the environment is for reporting, the report queries may have entirely different needs than the UI application has. Same thing for analysis.

    John Hick

  • Martin Bastable (5/6/2016)


    Question: For your particular use case/situation and with your available resources, have you found your solution to be positive and successful, and in any way measurable as such?

    Absolutely. A few years ago, there was a period of several months when I spent 99% of my time looking at execution plans and tracking down query performance problems. Now I hardly ever have to do that at all. I know that's not really an "official" metric to measure by, but it's absolutely the first and most obvious one that pops into my head.

    Also -- like I've said before -- I didn't even trust my own process at first; I had it make recommendations only, and it logged its recommendations in a table. Even now, when it automatically applies the indexes for me, it still logs them where I can see what it has done. When I first wrote it, I spent a LOT of time looking at those recommendations to make sure they made sense. There were quite a lot of them, and I did frequently catch the process creating and dropping the same indexes over and over again. That's where the customization and fine-tuning came in. Eventually, it got to the point where it was creating and dropping indexes only rarely, and it wasn't repeating itself, because I finally got it smart enough to only do the things that really needed doing. So, if you measure by the number of indexes created or dropped per day or per week, or by the number of indexes it has to recreate after being dropped, or by the number of unused indexes that it drops after being created, it has made things better; all of those activities have slowed way down since I started the process.

  • Everyone, please notice that I originally wrote and published this article over a year ago. I got a lot of feedback at the time about the "tone" of the article, I responded to that feedback, and all of those comments are still here. If I had been given the chance to edit the article before it got republished, I would have changed the tone, making it less of a rant and more of a technical discussion about the dynamic management views that SQL Server provides and how we can make good use of them. However, SQL Server Central republished it today exactly as it was written back in 2014. So please, before you reply, consider that the focus would have been quite different if I had rewritten it recently; pretend that the article is all about how to easily figure out what indexes might be good to create and what indexes might be OK to drop, and just don't respond to the rant or the tone.

  • It is worth mentioning that, since this article was first published, SQL Azure has received a lot of Microsoft's attention. We've started using it ourselves. Notice that Microsoft has implemented an Index Advisor for Azure SQL databases, which does EXACTLY THE SAME THING that my process does -- it looks at query performance history, recommends indexes to be created and dropped, and optionally implements the changes for you. So my idea couldn't have been all that bad if Microsoft did it too and is aggressively marketing it as a good feature to use. I personally suspect they got the idea from me. :satisfied:

    On the flip side, it is ALSO worth mentioning that, both in SQL Server and in SQL Azure, I have seen index recommendations show up in the missing index views even when the indexes already exist, and I have seen the index usage statistics show that those indexes are not being used even though the server thinks they are needed. In those cases, Microsoft's Index Advisor will recommend or create duplicate indexes; my process used to, as well, until I added code to check to see if an index exists before creating another one. I've brought this to Microsoft's attention, and they say they are currently investigating why the necessary indexes aren't being used. In the mean time, you are all correct to be skeptical about SQL Server's "missing index" recommendations. Treat them as recommendations, don't blindly implement all of them but look at them thoughtfully first, and you will find that those missing index views are good enough to save you some time and effort troubleshooting query performance. They are not perfect, but they are worth looking at.

  • Hehe, expected reactions...; "Thanks for trying to move the industry forward, maybe not real world but at least academically."

    Apart from doing a lot of the business analysis and designing thre integrated databases 1998-99 I was deeply involved in both server programming and production for more than 13 years, so I think I know what I'm talking about - data quality, no performance problems and extremely stable databases. And satisfied customers 🙂

    It's not an academical dream to implement referential integrity in a normalized database, it is reality and has been in more than two decades. Those claiming it's an academical dream should turn the pillow and dream on...

    But of course... it's much easier if you instead of the classical DBA have an "application DBA" who knows the business rules and can do some coding of procedures and triggers instead of backup and hardware maintenance - for sure.

  • Having to drop an unique index to increase performance can be done easily with a generated script wether it's a primary key with clustered index or a unique clustered index.

    However rebuilding unique clustered indexes is a clear indication of bad physical design and can be avoided by analysing structure, volumes and search needs correctly from the beginning...

Viewing 15 posts - 31 through 45 (of 52 total)

You must be logged in to reply to this topic. Login to reply