Altering large table

  • We have a table which has about 2,500,000 rows & daily in grows by about 10,000. This table already has 25 columns in it. Now for specific reason we need to add 15 columns to it. Now my question is that...

    1. Will performance of queries (Insert,Update,Delete,Select) affect if I will add these 15 columns to such large table??

    2. When we are adding any columns to such large table, where those columns actually get added in backend?? Means when we add new columns then will those columns gets added to existing data pages or creates new datapages with link to those pages?? Because I have observed that while using alter table command to add 15 columns to such large table(during testing) it takes few seconds only, so I think table doesn't get rearranged it's rows (data pages) to incorporate new columns. Is it so??

    3 Is there any differences in structure of table with 35 columns(in my e.g.) created at the beginning and the structure of table having 25 columns with 2,500,000 rows & then 15 columns added to it??

    Thanks in advance.

  • Hi

    i will try to answer your questions...

    1) - Yes . Performance of queries does get affected. Make sure that you have proper indexes and you retrieve only those rows and columns that you requre , NOT more.

    3) I think whenever new columns are added specially as many as 15 lot of page breaks can occur than normal.

    Of course this depends on the volume of data that is being manipulated but adding more columns later on can lead of lot of page breaks. Also keep in mind the fill factor. If requried adjust it to reflect the new columns added.

    In my opinion it would be better to normalize this large table (break it into 2 or more tables). This will lead to better query performace and less maintencance issues .

    "Keep Trying"

  • Chirag (6/16/2008)


    Hi

    i will try to answer your questions...

    1) - Yes . Performance of queries does get affected. Make sure that you have proper indexes and you retrieve only those rows and columns that you requre , NOT more.

    3) I think whenever new columns are added specially as many as 15 lot of page breaks can occur than normal.

    Of course this depends on the volume of data that is being manipulated but adding more columns later on can lead of lot of page breaks. Also keep in mind the fill factor. If requried adjust it to reflect the new columns added.

    In my opinion it would be better to normalize this large table (break it into 2 or more tables). This will lead to better query performace and less maintencance issues .

    Thanks for your response...

    Now my clarification on u'r suggestions..

    We have created all necessary indexes and retrieving only those rows & columns which are actually needed.

    We have also thought about creating separate table instead of adding 15 columns. But adding columns will not fulfill our need as I think (might be wrong..if u think pls explain), as these columns get updated only once but in reporting these columns along with others from same table will be used frequenlty. If we will create separate table then also we need to perform joins. Performing join on such large table for the purpose of only reporting I think performance will affect largely.

    As you have suggested to normalize this table, after adding 15 columns it will be in 3rd normalize form(only columns count increases but those columns depends on only primary key of table).

    If in any place I m wrong pls suggest.

    Thanks.

  • To answer you question on where columns are added is a bit lengthy. To really oversimplify simplify things greatly a row on a page in SQL Sever is broken down into a number of areas:

    some information

    fixed length column data

    variable length column data

    some information

    any columns that are nullable are also considered to be variable length.

    So when you alter the table the columns are added to the system tables immediately. As for the individual rows that reside on each data page, well they are updated (columns added) when the rows are modified. That is why the ALTER statement ran so quickly. had you performed this exercise in EM it may have taken quite a while to execute since it would have created an entirely new table with all the columns and transferred the data to the newly altered table.

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

  • And to expand on what Rudy has said, because the rows themselves are not affected until a modification takes place, this whole scenario changes if you add the columns NOT NULL with a default value. At that point, the rows must be modified to use the default value and your ALTER table statement will take ages on a table that size.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • rudy komacsar (6/18/2008)


    To answer you question on where columns are added is a bit lengthy. To really oversimplify simplify things greatly a row on a page in SQL Sever is broken down into a number of areas:

    some information

    fixed length column data

    variable length column data

    some information

    any columns that are nullable are also considered to be variable length.

    So when you alter the table the columns are added to the system tables immediately. As for the individual rows that reside on each data page, well they are updated (columns added) when the rows are modified. That is why the ALTER statement ran so quickly. had you performed this exercise in EM it may have taken quite a while to execute since it would have created an entirely new table with all the columns and transferred the data to the newly altered table.

    Thanks rudy...

    this is the actual the answer I was searching for. Would you pls explain wat is EM??IIn your replay) I don't know anything about it.

  • John Rowan (6/18/2008)


    And to expand on what Rudy has said, because the rows themselves are not affected until a modification takes place, this whole scenario changes if you add the columns NOT NULL with a default value. At that point, the rows must be modified to use the default value and your ALTER table statement will take ages on a table that size.

    Thanks for u'r reply. It will helpful to resolve our problem.

  • EM = SQL Server Enterprise Manager ...

    you will also see QA in many posts as well.

    QA = SQL Server Query Analyzer.

    RegardsRudy KomacsarSenior Database Administrator"Ave Caesar! - Morituri te salutamus."

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply