Tempdb 838GB's 2 mornings in a row

Question

Post reply

Tempdb 838GB's 2 mornings in a row

SkyBox

SSCarpal Tunnel

Points: 4288
More actions
December 8, 2011 at 7:58 am

#253267

The initial size for my Tempdb is 50gb and it normally does not exceed that by much. My db's on this prod box only use up about 600gb.

There's some new, massive process running and I've narrowed down the time period, but can't get the size of the current temp tables, to back track it to the process.

Development claims they haven't added any new processes or changed anything, and I have no maint. jobs running during these days. It's done by 7am, so I haven't been able to catch it in action and thought it was just a freak occurrence the first time. I restarted sql last night and the temp file was 50gb, but it's back up again this am.

Suppose I could set up a trace tonight and hope that it happens again around the same time period, but I 'd much rather end this today.

Anyone else ever run into this? Any suggestions? Just a matter of time before this process eats up all free space on the drive and crashes the system.

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply

Lowell SSC Guru Points: 323482 More actions · Answer 1

rebuilding indexes wth sort in tempdb, maybe?

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

SkyBox SSCarpal Tunnel Points: 4288 More actions · Answer 2

SkyBox

SSCarpal Tunnel

Points: 4288

December 8, 2011 at 8:52 am

#1418734

Lowell (12/8/2011)

rebuilding indexes wth sort in tempdb, maybe?

Nope

MyDoggieJessie SSC-Forever Points: 44461 More actions · Answer 3

Have you checked your SQL Logs? Have you checked the default trace file (should be in the log folder of your default SQL installation folder) to see if there's something caught there.

Consider creating an alert to notifying you during the time of significant growth:

USE [msdb]



GO

EXEC msdb.dbo.sp_add_alert @name=N'DBA - TempDB Growing Larger than usual',

		@message_id=0,

		@severity=0,

		@enabled=1,

		@delay_between_responses=300,

		@include_event_description_in=1,

		@category_name=N'[system]',

		@performance_condition=N'SQLServer:Databases|Log File(s) Used Size (KB)|tempdb|>|<<YOUR VALUE HERE IN BYTES>>',

		@job_id=N'00000000-0000-0000-0000-000000000000'

GO

______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

SkyBox SSCarpal Tunnel Points: 4288 More actions · Answer 4

MyDoggieJessie (12/8/2011)

Have you checked your SQL Logs? Have you checked the default trace file (should be in the log folder of your default SQL installation folder) to see if there's something caught there.

Consider creating an alert to notifying you during the time of significant growth:

Nothing in the sql logs, I do have alerts setup for drive space, and I know what time the free space reached it's threshold. I will review the default trace logs. thx

MyDoggieJessie SSC-Forever Points: 44461 More actions · Answer 5

MyDoggieJessie

SSC-Forever

Points: 44461

December 8, 2011 at 1:55 pm

#1418962

Did you find anything?

______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

matt.newman Hall of Fame Points: 3910 More actions · Answer 6

Could skim through SQL agent jobs execution time for anything that ran around that time if applicable. There might be a 2 day trend in activity for one job that does not normally take as long.

use msdb;

sp_help_jobhistory; /* plus a slew of other objects to sift through and other methods such as agent logs */

Even if someone says they did nothing, I do a quick check. Might want to do a quick check on objects in case anything comes to mind.

master..sp_msforeachdb



'if db_id(''?'') > 4

begin

	exec(''select ''''?'''', * from [?].sys.objects where modify_date > dateadd(day,-3,getdate())'')

end'

SkyBox SSCarpal Tunnel Points: 4288 More actions · Answer 7

SkyBox (12/8/2011)

MyDoggieJessie (12/8/2011)

Have you checked your SQL Logs? Have you checked the default trace file (should be in the log folder of your default SQL installation folder) to see if there's something caught there.

Consider creating an alert to notifying you during the time of significant growth:

Nothing in the sql logs, I do have alerts setup for drive space, and I know what time the free space reached it's threshold. I will review the default trace logs. thx

Going through the .trc file now - keep getting pulled away...

SkyBox SSCarpal Tunnel Points: 4288 More actions · Answer 8

I need to go back and look at 2-3 days worth. I am no expert at looking at trace data, although I see some intensive items, still need to get days of traces into a table and do some comparisons.

Feel free to take a peek at my attachment, which is the core processes running for the time period in question. Curious to see what stands out to you.

MyDoggieJessie SSC-Forever Points: 44461 More actions · Answer 9

Difficult to tell with out a StartTime reference as well but those are some seriously long durations. Looks to me that a LOT of SSIS jobs are running between 2am and 3:30am...you might want to take a look at what those are doing (in detail) and what is scheduled to run around those times.

What on earth is the "Picking Combinations"??? a quarter of a billion reads is unreal!!! If the underlying procedures contain a lot of temp tables, sorting, aggregates and such...that very well could be the root of your problem as that stuff will happen in the tempdb

______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

SkyBox SSCarpal Tunnel Points: 4288 More actions · Answer 10

matt.newman (12/8/2011)

Could skim through SQL agent jobs execution time for anything that ran around that time if applicable. There might be a 2 day trend in activity for one job that does not normally take as long.

use msdb;

sp_help_jobhistory; /* plus a slew of other objects to sift through and other methods such as agent logs */

Even if someone says they did nothing, I do a quick check. Might want to do a quick check on objects in case anything comes to mind.

master..sp_msforeachdb
'if db_id(''?'') > 4 begin exec(''select ''''?'''', * from [?].sys.objects where modify_date > dateadd(day,-3,getdate())'')
end'

No extended duration on the agent jobs - my backups are the longest and their duration didn't change. No objects changed. That job hist proc is awesome - didn't know about that.

SkyBox SSCarpal Tunnel Points: 4288 More actions · Answer 11

MyDoggieJessie (12/8/2011)

Difficult to tell with out a StartTime reference as well but those are some seriously long durations. Looks to me that a LOT of SSIS jobs are running between 2am and 3:30am...you might want to take a look at what those are doing (in detail) and what is scheduled to run around those times.

What on earth is the "Picking Combinations"??? a quarter of a billion reads is unreal!!! If the underlying procedures contain a lot of temp tables, sorting, aggregates and such...that very well could be the root of your problem as that stuff will happen in the tempdb

I think it's the pickingCombo app AND it does create a temp table with a specific name. The developer has built in a log that captures users' parameters/criteria. However, he claims that he looked it over and found nothing out of the ordinary when I asked him yesterday. More dates to compare will be helpful.

The IS jobs are exporting specific tables to a data warehouse - the duration is pretty consistent.

Jeffrey Williams SSC Guru Points: 89773 More actions · Answer 12

Here is what I would do...

Set a max size on each tempdb file so you don't exceed 100GB - since you say it normally doesn't grow much. Then, wait for the job to fail with the error that tempdb is full. If you don't see that error, wait for the developer to come to you and ask why his job failed with that error.

Yes, it is a bit drastic but it will identify the processes that are growing the files.

The autogrowth is going to be logged in the default trace. You can look there, but I don't think it will tell you anything more than when it grew and how much.

Jeffrey Williams
Problems are opportunities brilliantly disguised as insurmountable obstacles.

How to post questions to get better answers faster
Managing Transaction Logs

SkyBox SSCarpal Tunnel Points: 4288 More actions · Answer 13

Jeffrey Williams 3188 (12/9/2011)

Here is what I would do...

Set a max size on each tempdb file so you don't exceed 100GB - since you say it normally doesn't grow much. Then, wait for the job to fail with the error that tempdb is full. If you don't see that error, wait for the developer to come to you and ask why his job failed with that error.

Yes, it is a bit drastic but it will identify the processes that are growing the files.

The autogrowth is going to be logged in the default trace. You can look there, but I don't think it will tell you anything more than when it grew and how much.

I may have to consider it. It didn't happen Friday through today. I ran several processes that are scheduled during the time frame in question, including the SSIS packages and the tempDb didn't blow up at all. Although, it's nearly impossible to emulate what users were plugging into the VB app (pickCombo)that appears to be the culprit.

Only problem with capping the size of the TempDb, is that sql should fail-over to the other node once out of space, so sql service would restart. This would reset the size of tempDB and system would be back up within 30 seconds. Doubt that I could count on a user or developer to report it at 6am.

I will be monitoring closely and will report back with my findings.

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 14

SkyBox (12/8/2011)

MyDoggieJessie (12/8/2011)

Difficult to tell with out a StartTime reference as well but those are some seriously long durations. Looks to me that a LOT of SSIS jobs are running between 2am and 3:30am...you might want to take a look at what those are doing (in detail) and what is scheduled to run around those times.

What on earth is the "Picking Combinations"??? a quarter of a billion reads is unreal!!! If the underlying procedures contain a lot of temp tables, sorting, aggregates and such...that very well could be the root of your problem as that stuff will happen in the tempdb

I think it's the pickingCombo app AND it does create a temp table with a specific name. The developer has built in a log that captures users' parameters/criteria. However, he claims that he looked it over and found nothing out of the ordinary when I asked him yesterday. More dates to compare will be helpful.

The IS jobs are exporting specific tables to a data warehouse - the duration is pretty consistent.

The developer apparently doesn't understand what goes on behind the scenes. He or she has created an "accidental cross join" in the form of a "many-to-many" inner join and has probably covered it up with either a DISTINCT or a GROUP BY.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions