Question regarding Index.

  • A nonclustered index is analogous to an index in a textbook. The data is stored in one place, the index in another, with pointers to the storage location of the data. The items in the index are stored in the order of the index key values, but the information in the table is stored in a different order (which can be dictated by a clustered index).

    A clustered index determines the physical order of data in a table. A clustered index is analogous to a telephone directory, which arranges data by last name.

    After having read several articales on indexs i think I have understood how they work. But I think I still have a doubt which could not be clarified from any of the articles i have read

    What I understand is Non clustered index take up a separate space in memory(tempdb) to store the index key values and pointers to actual data in sorted order.

    But Clustered index is nothing but the table itself in sorted order based on index key . Hence it does not take up any additional space. Please correct me if a am wrong. Any further information is welcome..

    Do clustered indexes also take up space and how??

  • The nonclustered index does indeed take up disc space, but not really in tempdb.

    While actually building the index it may use space in tempdb, but the finihed index is stored in the actual database where the table resides.

    As for space taken by the Clusterd index, you are correct it is largely the data pages of the table itself. There will be a few (normally a very few) index pages which act as a tree structure (look up "btree") to point to the data pages.

    How many of those there will be mostly depends on how wide the index key is.

    If your index key is only 4 bytes (int) and your table row widt is say 2kb per row then your index page overhead will be tiny, but lets say we get crazy and create a clustered index on massive char column, the index overhead would be much larger.

    The descriptions of clustered indexes in "books on line" describes this in a lot more detail, but hopefully the above will get you started.

    Mike

  • http://qa.sqlservercentral.com/articles/Indexing/68439/

    http://qa.sqlservercentral.com/articles/Indexing/68563/

    http://qa.sqlservercentral.com/articles/Indexing/68636/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • You need to read some good literature about indexes. It's difficult to put it in few sentences, but I'll try.

    Every index consumes space and it has nothing to do with tempdb (tempdb is temporary storage db for temporary operations, index is not a temporary object).

    It consumes space because it stores duplicate of some columns from a table, in a physical structure called binary-tree which is sorted structure.

    For index you define KEY columns and optionally INCLUDED columns.

    KEY columns are stored in non-leaf nodes. INCLUDED columns are stored just in leaf nodes.

    Nonclustered index in leaf nodes has, besides included columns, a row identifier that points to a row physical position in a table (or is a logical id, PK, in case of nonclustered indexes on a clustered table). So when you need a column that is not among key or included columns - it can jump directly to the place in the table where the row is and fetch a value for such a column that is not contained in index.

    Clustered index has all the non-key columns in leaf nodes, like a nonclustered index with all non-key columns listed in INCLUDE clause. Because all the columns are in index, there is no need for the heap table physical structure to exist, because it would never be used.

    Number of rows in the (unfiltered) index is exactly the same as number of rows in a table. But it can contain just a subset of table columns, and be smaller for a full scan if you need just that columns.

    Filtered index can contain fewer rows than a table, but that's another story.

    Indexes can also be partitioned, with the same or different partition schema than the table or other indexes on that table, but that is also another story.

    You can see number of data pages in a clustered index, versus number of key pages with the tool sqlxdetails.com I wrote, and many other details about the indexes.

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • Vedran Kesegic (4/19/2012)


    It consumes space because it stores duplicate of some columns from a table, in a physical structure called binary-tree which is sorted structure.

    It's a b-tree, not a binary tree. To be pedantic, it's a b+ tree.

    Nonclustered index in leaf nodes has, besides included columns, a row identifier that points to a row physical position in a table (or is a logical id, PK, in case of nonclustered indexes on a clustered table). So when you need a column that is not among key or included columns - it can jump directly to the place in the table where the row is and fetch a value for such a column that is not contained in index.

    Not the primary key. In the case of a nonclustered index on a table with a clustered index, the 'pointer' is the clustered index key and a lookup is executed as a clustered index seek (it's not a direct navigation to the row)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/19/2012)


    Not the primary key. In the case of a nonclustered index on a table with a clustered index, the 'pointer' is the clustered index key

    Dear Gilla, I tried to be as short as possible. I read some of your articles and very respect your work. Please don't mind if I say that for non-clustered indexes on a non-unique clustered tables the 'pointer' it is not the clustered index key. Additional column is internally generated to make it unique. But, with all those details, how could one explain the basics, to look simple, and without growing to a full article?

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths
  • Vedran Kesegic (4/19/2012)


    Please don't mind if I say that for non-clustered indexes on a non-unique clustered tables the 'pointer' it is not the clustered index key. Additional column is internally generated to make it unique.

    It is the clustered index key. Nothing to do with the uniqueness of the nonclustered index. A non-unique clustered index gets an additional internal column added to the key when the clustered index is created. So the clustered index key is what is specified + the uniqifier. The clustered index key (which is the specified columns + uniquifier) is then used as the 'pointer' in any nonclustered index.

    I don't want to argue finicky details, just pointing out things that were incorrectly stated.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/19/2012)


    The clustered index key (which is the specified columns + uniquifier) is then used as the 'pointer' in any nonclustered index.

    Exactly. That's what I tried to explain. Not just the key is the pointer, but key broadened with the uniquifier.

    _____________________________________________________
    Microsoft Certified Master: SQL Server 2008
    XDetails Addin - for SQL Developers
    blog.sqlxdetails.com - Transaction log myths

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply