Index on nullable columns

  • Guys,

    I have amount field which is nullable - it also has a non clustered index on it.

    Is better to make the amount field NOT NULL with default constraint as '0' so that the index performance on the column improves.

    More specifially if I have 3 million rows table out of which for the amount indexed column has 1 million rows as null will the performance improve if the column is altered to NOT NULL field with default constraint 0.

    Thanks

  • I wouldn't think so. As far as index selectivity's concerned, 1 million nulls or 1 million 0s selects much the same.

    I'm not sure about nulls, but I think that a null is not stored, so your index may be smaller with nulls than 0s. Not much though, NC indexes are usually quite small anyway.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The other consideration for this is any calculations/aggregates you may use against this column. 0 is much different than Null. Check out this article before making this change, Gotcha! SQL Aggregate Functions and NULL[/url]

    Jack Corbett
    Consultant - Straight Path Solutions
    Check out these links on how to get faster and more accurate answers:
    Forum Etiquette: How to post data/code on a forum to get the best help
    Need an Answer? Actually, No ... You Need a Question

  • I just did a test on a 3-million row table.

    Added a varchar(100) column, plugged in some data in all of it (first names, if that matters), and then indexed it.

    The index ended up at 62.226 Meg.

    Then I nulled 1-million rows. The index shrank to 57.164 Meg.

    I then replaced all the nulls with zero-length strings, and the index stayed the same size.

    I then replaced all the zero-length strings with '1', and the index grew to 62.093 Meg.

    Doesn't say anything about the selectivity, there are other tests for that, but thought I'd test what Gail was surmising on nulls in indexes.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Agreed that storage required for the index is less.

    Does the index perform better on nullable column or not null column with default constraint? I always thought it is better to have indexed column wiht default value

    Thanks

  • There shouldn't be any difference in performance.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • There isn't any performance difference.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I saw this and wondered how nulls were stored in an index, so I had a play and - it seems that having a column defined as nullable ADDS 3 bytes to the length of each index row.

    I took a 3 column table and added indexes:

    CREATE TABLE TEST (

    INC INT PRIMARY KEY,

    VAL1 INT NULL,

    VAL2 INT NULL,

    VAL3 INT NOT NULL

    );

    CREATE INDEX IX_TEST_VAL1 ON TEST (VAL1);

    CREATE INDEX IX_TEST_VAL2 ON TEST (VAL2);

    CREATE INDEX IX_TEST_VAL3 ON TEST (VAL3);

    Then added a million rows, INC incrementing by 1 each time, VAL1 always null, VAL2 always 0, VAL3 always 0.

    Looking at dm_db_partition_stats, used and reserved page counts for the indexes on IX_TEST_VAL1 and IX_TEST_VAL2 were identical ... so no storage difference if you use null or 0. The page counts for IX_TEST_VAL3 were significantly lower (~30%).

    DBCC PAGE shows the following:

    IX_TEST_VAL1

    Slot 1, Offset 0x6c, Length 12, DumpStyle BYTE

    Record Type = INDEX_RECORD Record Attributes = NULL_BITMAP

    Memory Dump @0x32C9C06C

    00000000: 16000000 3b010000 000200fd ††††††††††....;.......

    IX_TEST_VAL2

    Slot 1, Offset 0x6c, Length 12, DumpStyle BYTE

    Record Type = INDEX_RECORD Record Attributes = NULL_BITMAP

    Memory Dump @0x33A1C06C

    00000000: 16000000 00010000 000200fc ††††††††††............

    IX_TEST_VAL3

    Slot 1, Offset 0x69, Length 9, DumpStyle BYTE

    Record Type = INDEX_RECORD Record Attributes =

    Memory Dump @0x3391C069

    00000000: 06000000 00010000 00†††††††††††††††††.........

    So for ints, it appears to be better to define the field as non-null from a storage perspective. I would guess that this holds true for all fixed length columns.

    So - am I missing something? It seems odd that an index can't use some kind of efficient null bitmap.

    Matt.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply