Mitigatng Circumstances for a Heap ?

Question

Post reply

Mitigatng Circumstances for a Heap ?

Simon_L

SSCarpal Tunnel

Points: 4669
More actions
November 5, 2008 at 7:29 am

#128503

Under what circumstances could you justify/allow a table to be a heap ? a temporary dumping ground that was written to far more than it was read from for example ? Or is there no reason at all why a table shouldn't have at least one index ?

thanks ~simon

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Ray Herring SSCertifiable Points: 5533 More actions · Answer 1

IMHO every physical table should have a clustered index. It is the absence of a clustered index that makes a table a heap. If there is no suitable (i.e., unique, seldom changing, small size) column then I add an Integer or Big Integer column with the Identity property enabled and use that column as the clustered index.

Note, the clustered index does not have to be the primary key but also the Natural Primary key does not have to be used as the actual Primary Key of the table. You can specify that the added column is the actual primary key and then add a Unique constraint for the Natural Key. This is useful when the natural key is an unwieldy combination of columns, may change frequently, is not naturally incrementing, etc.

Specifying a Unique index on the Natural Key means you can still define Foreign Keys that reference the column(s) and you can join on them as well. Using the Int/BigInt as the Primary Key and Clustered Index saves space in other indexes, saves space in the base table (it is used more efficiently), and probably improves performance on inserts.

YMMV 🙂

CrazyMan SSChampion Points: 12560 More actions · Answer 2

CrazyMan

SSChampion

Points: 12560

November 5, 2008 at 8:48 am

#894306

Hi Simon

It entirely depends upon the table purpose 😀

Simon_L SSCarpal Tunnel Points: 4669 More actions · Answer 3

Simon_L

SSCarpal Tunnel

Points: 4669

November 5, 2008 at 8:54 am

#894312

hiyah

so what purpose would you say yes its acceptable to be a heap ? 🙂

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 4

Simon_L (11/5/2008)

a temporary dumping ground that was written to far more than it was read from for example ?

Something that's written to frequently is a bad candidate for a heap because of the forwarding pointers that a heap can have. forwarding pointer occurs when a row is updated and doesn't fit on the page any longer. The row is moved and a pointer is left in its place showing where the row has moved to.

The nice thing is that it means the nonclustered indexes don't have to be updated when the row moves, the downside is that it means to read that row requires 2 random IOs, which, on a large table with lots of forwarding pointers is horrendously expensive in terms of IO.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 5

Simon_L (11/5/2008)

Under what circumstances could you justify/allow a table to be a heap ? a temporary dumping ground that was written to far more than it was read from for example ? Or is there no reason at all why a table shouldn't have at least one index ?

thanks ~simon

I'm a great believer that most tables must have a PK and a clustered index on something temporal... but the real answer is "It Depends". If it's just a staging table for imported data, and depending on how you're using it, it's sometimes better to have it with no clustered index or a PK...

I also have HUGE "nearly" online archive tables... all are heaps... none have indexes... they are there just to have the data available for emergencies and getting the data normally isn't an "instant" requirement of archived data.

Ok... why do I do that? Like I said, these tables are bloody huge... keeping indexes on them for every query that someone may want to use someday (maybe, once a year) is just too expensive disk-size-wise. I saved nearly 100GB just by dropping the original indexes from the archive tables.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

GabyYYZ SSCertifiable Points: 7913 More actions · Answer 6

Archive tables with little or no frequent access afterwards are probably the only time I'd avoid an index. Should I have to retrieve data from it in some manner, I could create a temporary clustered index based on the query I'm going to run, do my retrieval, and drop the index immediately (don't forget shrinkdatabase if space is getting tight).

Also, especially with archive tables that developers have forgetten about but everyone is scared to drop, even though space is getting tight, you may have to forego any index to reduce space usage.

Gaby
________________________________________________________________
"In theory, theory and practice are the same. In practice, they are not."
- Albert Einstein

SQLBOT SSCrazy Eights Points: 8014 More actions · Answer 7

Gaby A. (11/7/2008)

Archive tables with little or no frequent access afterwards are probably the only time I'd avoid an index. Should I have to retrieve data from it in some manner, I could create a temporary clustered index based on the query I'm going to run, do my retrieval, and drop the index immediately (don't forget shrinkdatabase if space is getting tight).

Also, especially with archive tables that developers have forgetten about but everyone is scared to drop, even though space is getting tight, you may have to forego any index to reduce space usage.

even then a clustered index doesn't hurt because it doesn't take up extra space.

~BOT

Craig Outcalt

Tips for new DBAs: http://www.sqlservercentral.com/articles/Career/64632
My other articles: http://www.sqlservercentral.com/Authors/Articles/Craig_Outcalt/560258

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 8

SQLBOT (11/7/2008)

even then a clustered index doesn't hurt because it doesn't take up extra space.

~BOT

Any bet's on that? 😉

USE TempDB



GO

--===== Create and populate a 1,000,000 row test table.

     -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers

     -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers

     -- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings

     -- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers

     -- Column "SomeDate" has a range of  >=01/01/2000 and <01/01/2010 non-unique date/times

     -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'

     --        for all rows.

     -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)

     -- Jeff Moden

 SELECT TOP 1000000

        RowNum       = IDENTITY(INT,1,1),

        SomeInt      = ABS(CHECKSUM(NEWID()))%50000+1,

        SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)

                     + CHAR(ABS(CHECKSUM(NEWID()))%26+65),

        SomeCSV      = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),

        SomeMoney    = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),

        SomeDate     = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

        SomeHex12    = RIGHT(NEWID(),12)

   INTO dbo.JBMTest

   FROM Master.dbo.SysColumns t1

  CROSS JOIN Master.dbo.SysColumns t2 

--===== Measure the table size before adding a clustered index

   DBCC UPDATEUSAGE (0,'dbo.JBMTest')

   EXEC dbo.sp_SpaceUsed 'dbo.JBMTest'

--===== Measure the table size after adding a clustered index

 CREATE CLUSTERED INDEX IXC_JBMTest_RowNum ON dbo.JBMTest (RowNum)

   DBCC UPDATEUSAGE (0,'dbo.JBMTest')

   EXEC dbo.sp_SpaceUsed 'dbo.JBMTest'

DROP TABLE dbo.JBMTest

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

TimothyAWiseman SSCrazy Eights Points: 8819 More actions · Answer 9

Ok, Jeff proved that the table is larger with an index than without, but why would that be true of a clustered index? Doesn't a clustered index primarily order the data already existing by the key and not create new data to be stored?

And to answer the original question, I often leave small tables in a heap that are created as temporary storage places for imported data. That data then gets beaten into shape and validated before being put into the (indexed and partially normalized) primary tables. Then the import tables are dropped.

---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 10

Because, contrary to popular belief, Clustered Indexes are NOT totally contained in the data at the Leaf level... there is an Intermediate Level containing index rows AND a Root Level, also containing index rows. The extra bytes come from them. Lookup "Clustered Indexes, Architecture" in Books Online to see more.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

TimothyAWiseman SSCrazy Eights Points: 8819 More actions · Answer 11

TimothyAWiseman

SSCrazy Eights

Points: 8819

November 8, 2008 at 12:53 am

#895920

That makes complete sense. Thanks.

---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 12

You bet... thanks for the feedback, Timothy.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions