Table Fragmentation and Avg_Page_space_used_in_percent

  • I have a fairly mature table in our database that tracks a users Last Login, we will call it UserLastLogin.

    Table definiation below:

    CREATE TABLE [dbo].[UserLastLogin](

    [ULL_ID] [bigint] IDENTITY(1,1) NOT NULL,

    [UT_ID] [tinyint] NOT NULL,

    [ULL_UserKey] [int] NOT NULL,

    [ULL_LoginTime] [datetime] NULL,

    [ULL_Count] [int] NOT NULL CONSTRAINT [DF_UserLastLogin_LastLogin_Count] DEFAULT ((0)),

    [ULL_IPAddress] [varchar](50) NULL,

    [ULL_Where] [tinyint] NULL,

    [ORG_ID] [int] NOT NULL,

    CONSTRAINT [PK_UserLastLogin] PRIMARY KEY CLUSTERED

    (

    [ULL_ID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    CREATE UNIQUE NONCLUSTERED INDEX [OrgUserKeyUserType] ON [dbo].[UserLastLogin]

    (

    [ORG_ID] ASC,

    [ULL_UserKey] ASC,

    [UT_ID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    This table tracks an application users last login which means that the first time the user logs in we will insert a row with all columns values.

    INSERT UserLastLogin (

    ULL_UserKey,

    UT_ID,

    ULL_IPAddress,

    ULL_where,

    Org_ID,

    ULL_LoginTime,

    ULL_Count

    )

    SELECT @UserKey,

    @UserType,

    @IP,

    @WHERE,

    @ORG_ID,

    getdate(),

    1

    The second time a user logs into the application we will simply update the existing row:

    UPDATE UserLastLogin

    SET ULL_Time = getdate(),

    ULL_Count = ULL_Count + 1,

    ULL_IPAddress = @IP,

    ULL_WHERE = @where

    WHERE org_id = @Org_ID

    AND ULL_UserKey = @UserKey

    AND UT_ID = @UserType

    This really is a pretty simple table but it sees a lot of updates through out the day.

    The table is failry mature at 1,738,426 rows and we are only seeing about 10,000 new rows per week, and the rest being existing row updates which can be anywhere from 7 to 10 millioin updates in a week.

    I am seeing that the clustered index and NC index fragments quickly very quickly and the space usage becomes very low which triggers index rebuild maintenance on this table every week.

    After the index rebuild I see the Fragmentation drop to less than 1% and the avg_page_space_used_in_percent go up to 99%.

    By the end of the week the Fragmentation goes up to 42% and avg_page_space_used_in_percent goes to 73%.

    Do to the nature of this table, I do not understand why we would continue to see fragmentation and avg_page_space_used_in_percent degrade this fast.

    There are not many variable length columns in the table:

    LoginTime is NULLable but it is always valued

    IPAddress is a VARCHAR but the length is usually consistent unless it gets valued with a NULL (which is possible). But I would think that once it is valued that it would achive the maximum row size.

  • Before anyone replies to this...I have answered my own question.

    I entered my whole post and the answer hit me upside the head as I was typing the last sentence.

    For education reasons, I thought it would be nice to post it anyway.

    The answer lies in the IP Address column.

    At any given time, 10 to 20% of these values in the table are null.

    This is a very volatile column moving back and forth between NULL and an actual value.

    When I rebuild the index, up to 20% of the IP Address column values are null and the index is built with a fillfactor of 0 (100 for the most part).

    Through out the week, these values will change from NULL to valued and back again multiple times.

    This causes my issue.

    My assumption was that once it was valued, the row lenght is maximized and we shoudl not have to be concerned. I did not account for the fact that it can go back to NULL and doing a rebuild when the column is NULL is as bad as inserting the row with IP being NULL and later valueing it.

    I have discussed this with my development team and I am removing the column.

    Using a FILLFACTOR might lessen the headache but it would only reduce index rebuilds to once every two weeks approximately. Since we have no dieing need for this value, the better solution for me is to remove the column.

    Respond if this is unclear or if you have questions or if you have an alternate solution.

    Thank You

  • That is a nice benefit of this forum - many times people will find that the issue resolution becomes clear once they write out the problem.

    Congrats on solving the issue.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Not to revive an old thread, but I ran across this when Googling for something else and if I found it someone else will too. In the spirit of educational purposes, for those who would not have the luxury of removing a column from the table, converting the column from a variable length type to a fixed length type would alleviate the issue. For an IP address field I would expect a CHAR(15) NULL would be a good choice.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • while we on this 🙂

    Personally I won't cluster the identity column.

    If one person have a single sign on I'd rather create a clustered index on this column. If 1 user can log in more than once

    then I'd give it more thought.

    Then instead of dropping the column (BTW you are a pretty good DBA if you can get dev to drop the column, you are definately intimidating enough)

    If you can't change the column , I'd move the column to the decimal IP address. It stores nice an tidy as int and the calculation to move it back and forward is pretty easy.

    Looking at your code again , once I remove the Clustered from the ID It seems that by doing on the userkey it would also help the update.

    Cheers

    Jannie

  • Jannie-186227 (7/19/2011)


    while we on this 🙂

    I hope I did not start anything here 🙂 My post was meant to help fellow Googlers in case they were on the hunt for fragmentation info.

    ...instead of dropping the column (BTW you are a pretty good DBA if you can get dev to drop the column, you are definately intimidating enough)

    I thought the same thing. That's like moving a mountain in a lot of places!

    If you can't change the column , I'd move the column to the decimal IP address. It stores nice an tidy as int and the calculation to move it back and forward is pretty easy.

    The INT is clever because it solves the fragmentation issue and takes less space than the CHAR. It is tidy, and it may be easy to convert back and forth, but it is worth mentioning that it does not happen without a cost. If you're using the technique everywhere you store IPs some of the issues can be mitigated. It's obviously not my first tendency. In some systems data consumers will also have a say. As usual, it will depend on the use-case.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply