tempdb.mdf on 2008 grown out of control than 2005.

  • :hehe:Hi Guys,

    After reading so many post, I am still in situation where my tempdb.mdf file grown all most 120GB.

    Here is my server config:

    SQL Server 2008 R2 on window server 2008.

    I have T:\ derive for my tempdb which is size as 120GB just for tempdb.

    Now, I have my ETL/production server which is SQL Server 2005 and I do not have any issue with this box on tempdb.mdf.

    both server has same job but somehow my tempdb.mdf file on sql server 2008 grown so much then my sql server 2005 (I do not have any issue on 2005 box,,its running like a charmmmm).

    tempdb config (both server config are same):

    - Enable AutoGrowth

    - File Growth = 10%

    - Maximum file size = Unrestricted Autogrowth

    - Initial size (.mdf) = 8 MB

    - Initial size (.ldf) = 1 MB

    I am looking for some guidance or advice since we just have our New box on 2008 plat form and we have to move our 3 more boxes on sql server 2008...

    Thanks everyone in advance.

    Thank You,

    😛

  • Growth of file is not depend on the version or edition you are using for sql server(thats what I think, and there are many expert here who can correct me if I am wrong).

    Coming to issue related to your growing tempdb file, seems like your most of the operation are going in tempdb. And one of those operation would be rebuild index with sort_in_tempdb. I think you need to check the jobs which are running and that can give you clue.

    And also there could be application which are using tempdb for most of operations in application and those query you can catch by profiler and filtering to dbname as tempdb and if you think there are long running queris as well(like report) then add further filter in profiler for duration

    ----------
    Ashish

  • My guess is that you are getting a different execution plan for a large query that is using tempdb for large sorts and/or large joins that wasn't happening on the 2005 server.

    If you can track the growth down to one query or procedure; you can capture the execution plan from both 2005 and 2008 R2 and compare them to see if there is an obvious difference.

    This could be caused by out of date statistics or possibly fragmentation. Or it could be something that has been fixed in one the of the CUs for 2008 R2. They are up to CU6. (Cumulative Updates) Note that for a lot of the query optimizer fixes you have to enable a trace flag, 4199 if I remember correctly, after installing the CU. (I have had to install CUs to resolve performance problems in 2008 R2 that we didn't have in 2005.)

    What version are you running on both servers? (SP and CU)

  • Hi

    Have you checked % of usage of your TempDB?

    use TempDB

    sp_spaceused

    Is the database "re-growing" or it was one time growth?(Could be some ill query ran by someone...?)

    In addition to running Profiler you could also monitor a few counters:

    PerfMon: SQLServer:Database (LogGrowths, LogFileSize etc)

    Also, In default trace, EventClass=92/93 should tell you the time of data/log file growth retrospectively so that might help to check the time when it is/was happening and try to relate it to some jobs, scheduled tasks, if any.

    To check default trace use:

    select * from fn_trace_GetInfo(0) - to find where it is

    select * from fn_Trace_GetTable('Path',0) - to get info

    when running select * from fn_trace_gettable filter it by EventClass=92 or 93 and DatabaseName or DatabaseId

    (don't forget to execute it for old traces as well!)

    Hope you will manage to find the cause 🙂

  • Hi Guys, thanks for your reply and here is my thought:

    @Ashish - My job contains 10 steps and each steps are stored procedures and there is no application connect to this server, it is SQL Server production server we have. 90% of tables are perm tables and most of get Truncate during process and reload with fresh data.

    @UMG - Can you tell me how to find version of SP & CU?

    @ModestyBlaise - I use your fn_trace script but not quite sure how to read the out put, it has pretty good information but i do not find a column where it give me size of tempdb or may be I am missing something...could you please provide me little more guidance...

    I appreciate guys for your kind input.

    - Keyun

  • keyun (4/7/2011)


    @UMG - Can you tell me how to find version of SP & CU?

    This should give you the SP and build number:

    SELECT @@Version;

    From the build number you can lookup the CU that is installed, if any.

  • Hi,

    Unfortunately, reading trace will not give information about size increase, but can give you information where the increase is coming from (ApplicationName, LoginName,StartTime)

    It should look something like:

    select * from fn_trace_getinfo(0)

    select * from fn_trace_gettable('PathReturnedByPreviousSelect',0)

    where EventClass in (92,93)

    and DatabaseName='TempDB'

    To capture change in size, you could monitor counters that I suggested.

    Since you are sure it is the job that is causing it, this might not be that useful.

    Have you checked the execution plan of your procedures?

    Are they creating temporary table? Table variable? Use CTEs?

    Is there any sorting?

    Are there hash joins in its executing plan?

    How many records is the process dealing with?

    If both jobs (one running on 2k5 and the other on 2k8) are running the same stored procedures and are dealing with the same number of records, compare execution plans for those procedures.

    When running profiler to trace them (in both cases) pay attention to number of logical reads/writes.

    When you figure out which exact procedure is problematic, if you are still not sure why it is happening, maybe you could send us more details about what it’s doing (code) and the exec plan.

  • Thanks guy for the update.

    @UMG - It is 2008 R2(RTM) - 10.50.1600.1 SP1

    @Modesty... - They way our DW structure is to use perm table than temp and there are lot of INSERT, UPDATE, TRUNC script are running where we load data from our source_db and message it. I would say we touch about 14 millions rows on nightly process. After your help with script, I have finally narrow down to one stored proc that I am going to run and monitor via trac...execution plan..etc..on both server 2k5 and 2k8.

    I appreciate for your help and I will keep you posted what i find.

    Thanks

  • keyun (4/8/2011)


    @UMG - It is 2008 R2(RTM) - 10.50.1600.1 SP1

    I think you pulled the SP1 from the end of the string which belongs to the OS version.

    You are running 2008 R2 with no SP or CU. So you are running a stock as originally released version. (Note: There haven't released a SP for 2008 R2 yet.)

    After you review the procedure that is causing the problem along with its execution plans, I suspect you will find that one of the CUs addresses an issue with the query optimizer picking a poor plan for one or more queries. (Assuming of course that it isn't bad statistics causing the poor plan choice.)

  • Hello All,

    Somewhat related and maybe not is I am incurring the same problem with a query that runs forever, over 6 hours before I cancel it, as the tempdb.mdf grows to 300GB+. The tempdb.log stays at 16MB. In my case, it runs just fine in 8 1/2 minutes on an older test server with nearly the same data. I even checked via DBCC and copied the two slightly different DBs back to the test server where it ran for over an hour but still finished, possibly taking longer due to other operations on it. I am alone so far on the server that does not work with no other operations on it.

    The twist here is while both are SQL 2008 R2, the one not working is 64 bit in both its SQL and OS.

    New Server – Does Not Work

    ProLiant DL380 G7

    Intel(R) Xeon(R) X5680 @ 3.33GHz, 3333 Mhz, 6 Core(s), 12 Logical Processor(s)

    36GB RAM / 56GB VM

    Microsoft Windows Server 2008 R2 Standard / Version 6.1.7601 Service Pack 1 Build 7601

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (Intel X86) Apr 2 2010 15:53:02 Copyright (c) Microsoft Corporation Standard Edition on Windows NT 6.0 <X86> (Build 6002: Service Pack 2)

    Cumulative Update Patch #7 for SQL 2008 R2 released just last Friday is installed

    Old Server – Works

    ProLiant DL360 G4p

    Intel(R) Xeon(TM) CPU 3.60GHz, 3600 Mhz, 1 Core(s), 2 Logical Processor(s)

    4GB RAM / 7GB VM

    Microsoft® Windows Server® 2008 Standard / Version 6.0.6002 Service Pack 2 Build 6002

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1746.0 (X64) Oct 5 2010 12:32:09 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

    I think I’m dealing with the bleeding edge of Microsoft and 64 bit. A MS whitepaper I have said of the error I first got:

    802 - There is insufficient memory available in the buffer pool.

    Causes: This error does not necessarily indicate an out-of-memory condition. It might indicate that the buffer pool memory is used by someone else. In SQL Server 2008 and SQL Server 2005, this error should be relatively rare.

    Troubleshooting: Use the general troubleshooting steps and recommendations outlined for the 701 error.

    This is both funny and sad because I am alone on this server, this is the only SQL operation or app of any kind running, and this is the first time I ran the query on it. I have gone through the 701 error troubleshooting tips and came up empty and cannot replicate the error above when it first ran for 3 hours before bombing. I have not been able to replicate te erorr since but also cannot get the query to finish.

    The query is a lopside join with almost 3 million records joined to one subquery and 3 other tables under 50K recs each. In spite of that and a lack of indexes, it does not explain why the smaller box can finish it and the big one can't.

    Suggestions?

    Thanks,

    Joe

  • Otis P. Driftwood (5/4/2011)


    Hello All,

    Somewhat related and maybe not is I am incurring the same problem with a query that runs forever, over 6 hours before I cancel it, as the tempdb.mdf grows to 300GB+. The tempdb.log stays at 16MB. In my case, it runs just fine in 8 1/2 minutes on an older test server with nearly the same data. I even checked via DBCC and copied the two slightly different DBs back to the test server where it ran for over an hour but still finished, possibly taking longer due to other operations on it. I am alone so far on the server that does not work with no other operations on it.

    The twist here is while both are SQL 2008 R2, the one not working is 64 bit in both its SQL and OS.

    New Server – Does Not Work

    ProLiant DL380 G7

    Intel(R) Xeon(R) X5680 @ 3.33GHz, 3333 Mhz, 6 Core(s), 12 Logical Processor(s)

    36GB RAM / 56GB VM

    Microsoft Windows Server 2008 R2 Standard / Version 6.1.7601 Service Pack 1 Build 7601

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (Intel X86) Apr 2 2010 15:53:02 Copyright (c) Microsoft Corporation Standard Edition on Windows NT 6.0 <X86> (Build 6002: Service Pack 2)

    Cumulative Update Patch #7 for SQL 2008 R2 released just last Friday is installed

    Old Server – Works

    ProLiant DL360 G4p

    Intel(R) Xeon(TM) CPU 3.60GHz, 3600 Mhz, 1 Core(s), 2 Logical Processor(s)

    4GB RAM / 7GB VM

    Microsoft® Windows Server® 2008 Standard / Version 6.0.6002 Service Pack 2 Build 6002

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1746.0 (X64) Oct 5 2010 12:32:09 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

    I think I’m dealing with the bleeding edge of Microsoft and 64 bit. A MS whitepaper I have said of the error I first got:

    802 - There is insufficient memory available in the buffer pool.

    Causes: This error does not necessarily indicate an out-of-memory condition. It might indicate that the buffer pool memory is used by someone else. In SQL Server 2008 and SQL Server 2005, this error should be relatively rare.

    Troubleshooting: Use the general troubleshooting steps and recommendations outlined for the 701 error.

    This is both funny and sad because I am alone on this server, this is the only SQL operation or app of any kind running, and this is the first time I ran the query on it. I have gone through the 701 error troubleshooting tips and came up empty and cannot replicate the error above when it first ran for 3 hours before bombing. I have not been able to replicate te erorr since but also cannot get the query to finish.

    The query is a lopside join with almost 3 million records joined to one subquery and 3 other tables under 50K recs each. In spite of that and a lack of indexes, it does not explain why the smaller box can finish it and the big one can't.

    Suggestions?

    Thanks,

    Joe

    The first thing I would say is to check the version information you posted. As it looks like you have the 32 bit version of SQL Server installed on the new server, and you have the 64-bit version on the old server. (I assume you just have some of the information swapped between the two, since there isn't an x86 version of Windows Server 2008 R2.)

    The biggest thing I noticed is that you aren't running the same version of SQL Server on both boxes. (The old server is running the RTM version of 2008 R2 with no CU installed.) So it could be the CU causing your problem. (Have you enabled the 4199 trace flag to turn on the query optimizer fixes on the new server?)

    Other things to check: You mention it is in a VM, is it all configured correctly? Have you ran SQLIO or something to verify you aren't having I/O issues? I would do is gather the execution plan for the same query on both servers and compare them. (If you can post them that would be great.)

  • You're right, SSCrazy. I did swap the SQL version info by accident...sorry. I did fix the problem, but it was not by any adjustments to the environment. I broke the nested query into two with a temp table. It just seems odd that instead of having it gradually take longer over time before I rework it as I have sometimes done, this thing worked fine one minute on one machine and then not at all when I moved it. It's about 15 min now, which is tolerable.

    I will try to when I get a chance to revisit it with your suggestions. I'm concerned some other process might cough up a hairball and wonder if it's something in the 64 bit environment that does not work with this situation without some adjustments away from the default settings.

    Joe

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply