DTS Scheduled package HANGS sporadically

  • Hi,

    I am running a dozen or so of DTS Scheduled jobs which call a DTS package each. The packages basically are pooling data from views of the Production server with SQL Server 2000 to another Reporting Server with SQL Server 2000 again (Refer a representation below).

    Production Server(SQL Sever 2000) View ------DATA-----> Tables of

    Reporting Server(SQL Sever 2000)

    The packages have logging enabled. The jobs are scheduled to run at 1 AM everyday. The problem which I am seeing is that the jobs seem to run fine most of the times. But sometimes sporadically 1 or 2 jobs (without much of a pattern of which job it is) HANG up. HANG in the sense; the jobs are the Execution state for more than 10-15 hours, without ever completing. I have error mailing enabled in the jobs, but since jobs (which hang) are typically not failing, so I am not getting notified that the jobs did not complete. If I just go ahead and stop the job, restart it again it runs through successfully!

    My questions are

    1. Why do these jobs HANG up? Is there anything I can do to avoid this from happening?

    2. Can I can automate the process of checking that jobs did not omplete after a particular interval of time, then stop the job and restart the job again.

    3. How can I get notified by e-mail if these jobs remain in the HANG state after a particular interval of time since the job started?

    Thanks a lot for your thoughts!

  • Not familiar with your specific jobs, but I had a similar problem a long time ago with other installed software (antivirus software) kicking off updates and nightly scans that conflicted with some files my DTS jobs relied on.  Do you have any other applications running on the machine that kick off regularly scheduled jobs or anything else that might conflict?

  • I do not think that is the case, for it is only one or two of the jobs which HANG up and not the other jobs which were running in parallel and pooling the data from the same Server(but from different views) get done successfully.
     
    So, I do not think it should be the result of some other application.
  • Just throwing out ideas...  that's what troubleshooting is about

  • More ideas thrown out...

    Have you looked at locking / contention issues?

    Are there loops in these procs that might be getting "stuck"?

    Have you run a profiler trace to see what SQL is being run?

  • Responses Inline

    Have you looked at locking / contention issues?

    I have not looked into something related to locking specifically. Basically what I have included is a 'NO LOCK' statement in the the views. Can you give me some ideas, of what kind of corrections I should be looking at ?

    Are there loops in these procs that might be getting "stuck"?

    There are no loops.

    Have you run a profiler trace to see what SQL is being run?

    Nope, I have not. I am sure to check it next time. What should I be looking at in the profiler?

    Any other thoughts.

    Thanks.

  • Locking:

    Locking / contention should show up if you issue a sp_who2 command.

    You might also want to consider using:

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    on any stored procs / sql statements as long as they are read-only.

    Profiler:

    In the profiler, look for SQL statements that repeat, look at the last transaction for the "trouble" package, look at CPU usage.

    One other thing:

    You may want to look and see if this is a networking issue.

  • Thanks for all your thoughts to Pam and anyone else who answered to my post.
     
    If anyone has some more thoughts/suggestions to add please feel free to add some more.
  • You say you put the NOLOCK hint on all the tables in your views right?

  • Yes, I do.

  • 2 random thoughts from myself... as this is something I have encountered in a slightly different environment:

    Are the source and destination authenticating as service accounts... the same account?

    I ask because one of the jobs I had a while back hung every now and then, it took me about 3 times to realise that it coincided with the days my password was up for renewal on the destination and the destination (teradata) was prompting the job to change password... the job obviously didn't have a clue what the destination was waiting for and so they just both sat there... waiting for human intervention!

    Secondly... is anything else which is running at the same time in a Transaction that may not have comitted by the time your view is trying to reference the tables?

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply