Tempdb 838GB's 2 mornings in a row

  • The initial size for my Tempdb is 50gb and it normally does not exceed that by much. My db's on this prod box only use up about 600gb.

    There's some new, massive process running and I've narrowed down the time period, but can't get the size of the current temp tables, to back track it to the process.

    Development claims they haven't added any new processes or changed anything, and I have no maint. jobs running during these days. It's done by 7am, so I haven't been able to catch it in action and thought it was just a freak occurrence the first time. I restarted sql last night and the temp file was 50gb, but it's back up again this am.

    Suppose I could set up a trace tonight and hope that it happens again around the same time period, but I 'd much rather end this today.

    Anyone else ever run into this? Any suggestions? Just a matter of time before this process eats up all free space on the drive and crashes the system.

  • rebuilding indexes wth sort in tempdb, maybe?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (12/8/2011)


    rebuilding indexes wth sort in tempdb, maybe?

    Nope

  • Have you checked your SQL Logs? Have you checked the default trace file (should be in the log folder of your default SQL installation folder) to see if there's something caught there.

    Consider creating an alert to notifying you during the time of significant growth:

    USE [msdb]

    GO

    EXEC msdb.dbo.sp_add_alert @name=N'DBA - TempDB Growing Larger than usual',

    @message_id=0,

    @severity=0,

    @enabled=1,

    @delay_between_responses=300,

    @include_event_description_in=1,

    @category_name=N'[system]',

    @performance_condition=N'SQLServer:Databases|Log File(s) Used Size (KB)|tempdb|>|<<YOUR VALUE HERE IN BYTES>>',

    @job_id=N'00000000-0000-0000-0000-000000000000'

    GO

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (12/8/2011)


    Have you checked your SQL Logs? Have you checked the default trace file (should be in the log folder of your default SQL installation folder) to see if there's something caught there.

    Consider creating an alert to notifying you during the time of significant growth:

    Nothing in the sql logs, I do have alerts setup for drive space, and I know what time the free space reached it's threshold. I will review the default trace logs. thx

  • Did you find anything?

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • Could skim through SQL agent jobs execution time for anything that ran around that time if applicable. There might be a 2 day trend in activity for one job that does not normally take as long.

    use msdb;

    sp_help_jobhistory; /* plus a slew of other objects to sift through and other methods such as agent logs */

    Even if someone says they did nothing, I do a quick check. Might want to do a quick check on objects in case anything comes to mind.

    master..sp_msforeachdb

    'if db_id(''?'') > 4

    begin

    exec(''select ''''?'''', * from [?].sys.objects where modify_date > dateadd(day,-3,getdate())'')

    end'

  • SkyBox (12/8/2011)


    MyDoggieJessie (12/8/2011)


    Have you checked your SQL Logs? Have you checked the default trace file (should be in the log folder of your default SQL installation folder) to see if there's something caught there.

    Consider creating an alert to notifying you during the time of significant growth:

    Nothing in the sql logs, I do have alerts setup for drive space, and I know what time the free space reached it's threshold. I will review the default trace logs. thx

    Going through the .trc file now - keep getting pulled away...

  • I need to go back and look at 2-3 days worth. I am no expert at looking at trace data, although I see some intensive items, still need to get days of traces into a table and do some comparisons.

    Feel free to take a peek at my attachment, which is the core processes running for the time period in question. Curious to see what stands out to you.

  • Difficult to tell with out a StartTime reference as well but those are some seriously long durations. Looks to me that a LOT of SSIS jobs are running between 2am and 3:30am...you might want to take a look at what those are doing (in detail) and what is scheduled to run around those times.

    What on earth is the "Picking Combinations"??? a quarter of a billion reads is unreal!!! If the underlying procedures contain a lot of temp tables, sorting, aggregates and such...that very well could be the root of your problem as that stuff will happen in the tempdb

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • matt.newman (12/8/2011)


    Could skim through SQL agent jobs execution time for anything that ran around that time if applicable. There might be a 2 day trend in activity for one job that does not normally take as long.

    use msdb;

    sp_help_jobhistory; /* plus a slew of other objects to sift through and other methods such as agent logs */

    Even if someone says they did nothing, I do a quick check. Might want to do a quick check on objects in case anything comes to mind.

    master..sp_msforeachdb

    'if db_id(''?'') > 4

    begin

    exec(''select ''''?'''', * from [?].sys.objects where modify_date > dateadd(day,-3,getdate())'')

    end'

    No extended duration on the agent jobs - my backups are the longest and their duration didn't change. No objects changed. That job hist proc is awesome - didn't know about that.

  • MyDoggieJessie (12/8/2011)


    Difficult to tell with out a StartTime reference as well but those are some seriously long durations. Looks to me that a LOT of SSIS jobs are running between 2am and 3:30am...you might want to take a look at what those are doing (in detail) and what is scheduled to run around those times.

    What on earth is the "Picking Combinations"??? a quarter of a billion reads is unreal!!! If the underlying procedures contain a lot of temp tables, sorting, aggregates and such...that very well could be the root of your problem as that stuff will happen in the tempdb

    I think it's the pickingCombo app AND it does create a temp table with a specific name. The developer has built in a log that captures users' parameters/criteria. However, he claims that he looked it over and found nothing out of the ordinary when I asked him yesterday. More dates to compare will be helpful.

    The IS jobs are exporting specific tables to a data warehouse - the duration is pretty consistent.

  • Here is what I would do...

    Set a max size on each tempdb file so you don't exceed 100GB - since you say it normally doesn't grow much. Then, wait for the job to fail with the error that tempdb is full. If you don't see that error, wait for the developer to come to you and ask why his job failed with that error.

    Yes, it is a bit drastic but it will identify the processes that are growing the files.

    The autogrowth is going to be logged in the default trace. You can look there, but I don't think it will tell you anything more than when it grew and how much.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams 3188 (12/9/2011)


    Here is what I would do...

    Set a max size on each tempdb file so you don't exceed 100GB - since you say it normally doesn't grow much. Then, wait for the job to fail with the error that tempdb is full. If you don't see that error, wait for the developer to come to you and ask why his job failed with that error.

    Yes, it is a bit drastic but it will identify the processes that are growing the files.

    The autogrowth is going to be logged in the default trace. You can look there, but I don't think it will tell you anything more than when it grew and how much.

    I may have to consider it. It didn't happen Friday through today. I ran several processes that are scheduled during the time frame in question, including the SSIS packages and the tempDb didn't blow up at all. Although, it's nearly impossible to emulate what users were plugging into the VB app (pickCombo)that appears to be the culprit.

    Only problem with capping the size of the TempDb, is that sql should fail-over to the other node once out of space, so sql service would restart. This would reset the size of tempDB and system would be back up within 30 seconds. Doubt that I could count on a user or developer to report it at 6am.

    I will be monitoring closely and will report back with my findings.

  • SkyBox (12/8/2011)


    MyDoggieJessie (12/8/2011)


    Difficult to tell with out a StartTime reference as well but those are some seriously long durations. Looks to me that a LOT of SSIS jobs are running between 2am and 3:30am...you might want to take a look at what those are doing (in detail) and what is scheduled to run around those times.

    What on earth is the "Picking Combinations"??? a quarter of a billion reads is unreal!!! If the underlying procedures contain a lot of temp tables, sorting, aggregates and such...that very well could be the root of your problem as that stuff will happen in the tempdb

    I think it's the pickingCombo app AND it does create a temp table with a specific name. The developer has built in a log that captures users' parameters/criteria. However, he claims that he looked it over and found nothing out of the ordinary when I asked him yesterday. More dates to compare will be helpful.

    The IS jobs are exporting specific tables to a data warehouse - the duration is pretty consistent.

    The developer apparently doesn't understand what goes on behind the scenes. He or she has created an "accidental cross join" in the form of a "many-to-many" inner join and has probably covered it up with either a DISTINCT or a GROUP BY.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply