ADD, ALTER COLUMNS on large tables

  • I have to change four columns from CHAR(10) to VARCHAR(10) on a table with 7 million rows (long rows).  None of the columns are indexed.

    I have to add an integer column to a table with about 15 million rows.  Again, no index changes.

    In both cases, when I run a script to alter the columns, each table takes about four hours to process. 

    Are there better options such as unloading the table first or minimizing logging?

  • Tom, I had a fairly lengthy response written but the preview clobbered it. To summarize...

    Are you doing the modifications (or generating your script) through EM? EM will create a new table, copy the data, drop the old table, and then rename the new one. With big tables, that can really slam the disks. A direct ALTER TABLE statement will be faster for adding the int column. It'll be hit or miss with the char-to-varchar conversion, since you can only alter one column at a time.
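    A minimal sketch of the direct-ALTER approach (table and column names here are hypothetical; match the original nullability of each column):

    ```sql
    -- ALTER COLUMN handles one column per statement; run them one at a time.
    ALTER TABLE dbo.BigTable ALTER COLUMN col1 VARCHAR(10) NULL;
    ALTER TABLE dbo.BigTable ALTER COLUMN col2 VARCHAR(10) NULL;
    ALTER TABLE dbo.BigTable ALTER COLUMN col3 VARCHAR(10) NULL;
    ALTER TABLE dbo.BigTable ALTER COLUMN col4 VARCHAR(10) NULL;
    ```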

    Data file and log file growth operations are a possible cause of the slow performance. Preallocate both to a sufficient size. Disable autoshrink. Don't set a percentage for the filegrowth; use fixed values. On fast arrays, a 100MB increment is allocated plenty fast.
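    For example (database and logical file names are hypothetical; size the files to whatever the operation will actually need):

    ```sql
    -- Preallocate before the ALTER, and use fixed-size growth increments
    -- instead of a percentage.
    ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Data, SIZE = 20GB, FILEGROWTH = 100MB);
    ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_Log,  SIZE = 5GB,  FILEGROWTH = 100MB);
    ALTER DATABASE MyDb SET AUTO_SHRINK OFF;
    ```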

    Via the EM gui or ALTER TABLE t-sql, the bulk of the work is going to be performed within a single implicit transaction, so changing logging methods won't benefit much. Are your data & log files on separate arrays?

    How wide are your rows? Under ~4kb/row, page splits are likely, which can dramatically increase the storage requirement of your table. Again, preallocate the files beforehand.

    Are you providing a default value for the new int column? If so, can you leave it null? Doing so should dramatically improve the speed of an ALTER TABLE t ADD ...
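    The difference can be sketched as follows (hypothetical names; on older versions and editions, the NOT NULL + default form forces a full rewrite of the table):

    ```sql
    -- Adding the column as NULL with no default is a metadata-only change:
    ALTER TABLE dbo.BigTable ADD new_col INT NULL;

    -- Adding it NOT NULL with a default forces every existing row to be
    -- updated, which is what makes the ALTER slow on 15 million rows:
    ALTER TABLE dbo.BigTable ADD new_col INT NOT NULL DEFAULT 0;
    ```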

    A truncate..alter..bulk insert approach may improve things. Definitely change from FULL to SIMPLE or BULK_LOGGED recovery if you do this.
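    Roughly (database name is hypothetical; back up before and after so you don't break your restore chain):

    ```sql
    ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;
    -- truncate the table, run the ALTERs, then bulk insert the data back in
    ALTER DATABASE MyDb SET RECOVERY FULL;
    -- Take a full or log backup afterward to restart the log chain.
    ```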

  • If you design the table and alter it through SQL Enterprise Mangler, it will recreate the table.  Does the table alteration require data to change, or is it a metadata-only change? You could be doing a lot of work that requires a lot of time.

    Some column alterations are best performed by copying the column: add a new column, populate it a few thousand rows at a time, drop the original column, and rename the new column.  These articles on managing schema changes are worth reading:

    http://blogs.msdn.com/sqlcat/archive/2006/03/01/541550.aspx

    http://blogs.msdn.com/sqlcat/archive/2006/03/31/566046.aspx
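    The copy-the-column pattern above can be sketched like this (table and column names are hypothetical; the batch size is a tuning knob):

    ```sql
    ALTER TABLE dbo.BigTable ADD col1_new VARCHAR(10) NULL;
    GO
    -- Populate a few thousand rows per statement to keep each
    -- transaction, and therefore the log, small.
    DECLARE @rows INT;
    SET @rows = 1;
    WHILE @rows > 0
    BEGIN
        UPDATE TOP (5000) dbo.BigTable
        SET col1_new = col1
        WHERE col1_new IS NULL AND col1 IS NOT NULL;
        SET @rows = @@ROWCOUNT;
    END
    GO
    ALTER TABLE dbo.BigTable DROP COLUMN col1;
    EXEC sp_rename 'dbo.BigTable.col1_new', 'col1', 'COLUMN';
    ```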

    Bill

  • It also depends on where you are adding columns.  If you are adding columns between existing columns, the system needs to recreate the table and move data.  If you add a column at the end of the row, it usually happens much more quickly, as the table does not need to be recreated.

    Based on your post, however, it appears that you may have very few rows per page as it is, so adding a new column causes data to move between pages as the pages split.

    In this case, you may also want to look at vertically partitioning your data if possible.
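    A minimal sketch of a vertical split, assuming a hypothetical integer key `id` on the existing table: move the wide, rarely-used columns into a second table sharing the same key, and join them back through a view when the full row is needed.

    ```sql
    CREATE TABLE dbo.BigTable_Wide
    (
        id        INT NOT NULL PRIMARY KEY
                  REFERENCES dbo.BigTable (id),
        big_notes VARCHAR(8000) NULL
    );
    GO
    CREATE VIEW dbo.BigTable_Full AS
    SELECT t.id, t.col1, w.big_notes
    FROM dbo.BigTable AS t
    LEFT JOIN dbo.BigTable_Wide AS w ON w.id = t.id;
    ```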
