Maintenance Plan is failing

  • Hi,

    I have Maintenance plan created on the database server which basically takes a backup daily and deletes the previous days backup file.

    Also, the log files are getting generated in D:\Program Files\Microsoft SQL Server\MSSQL\LOG folder.

    I can see all the log files which are getting generated .

    I have pasted the log file which has generated on 10/27/2009 and 10/28/2009.

    IF you see at the end of log1 , it clearly says unable to delete the backup files.

    Similarly, if you see the end of log2 , it says the old file is deleted successfully.

    The problem is, every alternate day the job is getting failed and we are getting and receiving an email saying that

    the backup has failed, but the actually the backup is being done successfully but was unable to delete the preivous day backup (RETENTION PERIOD = 1 DAY).

    How to trouble shoot this issue.

    One its working fine able to delete the prvious day bkp, but on the next it is unable to delete the previous day backup!!!!!

    Can anyone figure out why it is happening so.

    Also, find the attached screen shot for RETENTION period of 1 day.

    In the Notification section/TAB of the job, we are logging into Windows Event viewer incase of job failure.

    But i cannot see any log entry in the Event Viewer. This is obvious, because the job contains three 3 steps ,and it is getting Failed at step1 which fails at deleting the backup

    and goes to step3 which sends an Email saying "backup failed"

    ====================================

    log1 generated on 10/27/2009

    ====================================

    Starting maintenance plan 'All Application DBs' on 10/27/2009 9:30:00 PM

    [1] Database ActionOI_CDI: Database Backup...

    Destination: [E:\SQLDUMP\CDI_db_200910272130.BAK]

    ** Execution Time: 0 hrs, 0 mins, 1 secs **

    [2] Database ActionOI_CDI: Verifying Backup...

    ** Execution Time: 0 hrs, 0 mins, 1 secs **

    [3] Database ActionOI_CG: Database Backup...

    Destination: [E:\SQLDUMP\CDI_db_200910272130.BAK]

    ** Execution Time: 0 hrs, 1 mins, 37 secs **

    [4] Database ActionOI_CG: Verifying Backup...

    ** Execution Time: 0 hrs, 0 mins, 38 secs **

    [15] Database ActionOI_Template: Delete Old Backup Files...

    Unable to delete file E:\SQLDUMP\CDI_db_200910262132.BAK. 0 file(s) deleted.

    Deleting old text reports... 1 file(s) deleted.

    End of maintenance plan 'All Application DBs' on 10/27/2009 9:42:28 PM

    SQLMAINT.EXE Process Exit Code: 1 (Failed)

    ================================

    log2 generated on 10/28/2009

    ===============================

    [11] Database ActionOI_CDI: Delete Old Backup Files...

    1 file(s) deleted.

    [12] Database ActionOI_CG: Delete Old Backup Files...

    1 file(s) deleted.

    [13] Database ActionOI_Queue: Delete Old Backup Files...

    1 file(s) deleted.

    [14] Database ActionOI_RptLog: Delete Old Backup Files...

    1 file(s) deleted.

    [15] Database ActionOI_Template: Delete Old Backup Files...

    1 file(s) deleted.

    Deleting old text reports... 1 file(s) deleted.

    End of maintenance plan 'All Application DBs' on 10/28/2009 3:00:09 AM

    SQLMAINT.EXE Process Exit Code: 0 (Success)

    Below is the script which i have generated from GUI

    DECLARE @JobID BINARY(16)

    DECLARE @ReturnCode INT

    SELECT @ReturnCode = 0

    IF (SELECT COUNT(*) FROM msdb.dbo.syscategories WHERE name = N'Database Maintenance') < 1

    EXECUTE msdb.dbo.sp_add_category @name = N'Database Maintenance'

    -- Delete the job with the same name (if it exists)

    SELECT @JobID = job_id

    FROM msdb.dbo.sysjobs

    WHERE (name = N'DB Backup Job for DB Maintenance Plan ''DB Maintenance Plan ActionOI_Template''')

    IF (@JobID IS NOT NULL)

    BEGIN

    -- Check if the job is a multi-server job

    IF (EXISTS (SELECT *

    FROM msdb.dbo.sysjobservers

    WHERE (job_id = @JobID) AND (server_id <> 0)))

    BEGIN

    -- There is, so abort the script

    RAISERROR (N'Unable to import job ''DB Backup Job for DB Maintenance Plan ''DB Maintenance Plan ActionOI_Template'''' since there is already a multi-server job with this name.', 16, 1)

    GOTO QuitWithRollback

    END

    ELSE

    -- Delete the [local] job

    EXECUTE msdb.dbo.sp_delete_job @job_name = N'DB Backup Job for DB Maintenance Plan ''DB Maintenance Plan ActionOI_Template'''

    SELECT @JobID = NULL

    END

    BEGIN

    -- Add the job

    EXECUTE @ReturnCode = msdb.dbo.sp_add_job @job_id = @JobID OUTPUT , @job_name = N'DB Backup Job for DB Maintenance Plan ''DB Maintenance Plan ActionOI_Template''', @owner_login_name = N'sa', @description = N'No description available.', @category_name = N'Database Maintenance', @enabled = 1, @notify_level_email = 0, @notify_level_page = 0, @notify_level_netsend = 0, @notify_level_eventlog = 2, @delete_level= 0

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    -- Add the job steps

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 1, @step_name = N'Step 1', @command = N'EXECUTE master.dbo.xp_sqlmaint N''-PlanID FC8F5D33-E6BD-4E61-BEAD-6BD563620322 -Rpt "d:\Program Files\Microsoft SQL Server\MSSQL\LOG\All Application DBs4.txt" -DelTxtRpt 4DAYS -WriteHistory -VrfyBackup -BkUpMedia DISK -BkUpDB "E:\SQLDUMP" -DelBkUps 1DAYS -CrBkSubDir -BkExt "BAK"''', @database_name = N'master', @server = N'', @database_user_name = N'', @subsystem = N'TSQL', @cmdexec_success_code = 0, @flags = 4, @retry_attempts = 0, @retry_interval = 0, @output_file_name = N'', @on_success_step_id = 2, @on_success_action = 4, @on_fail_step_id = 3, @on_fail_action = 4

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 2, @step_name = N'SendSuccess', @command = N'sendemail -f healthcare.prod@ABC.com -t babu@ABC.com;RAM@ABC.COM -cc XYZ@abc.com;traj@abc.com -u Backup Succeeded on DBSERVER01 -m Backup has been completed successfully -s 192.168.1.103', @database_name = N'', @server = N'', @database_user_name = N'', @subsystem = N'CmdExec', @cmdexec_success_code = 0, @flags = 0, @retry_attempts = 0, @retry_interval = 1, @output_file_name = N'', @on_success_step_id = 0, @on_success_action = 1, @on_fail_step_id = 0, @on_fail_action = 2

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 3, @step_name = N'SendFailed', @command = N'sendemail -f healthcare.prod@ABC.com -t babu@ABC.com;RAM@ABC.COM -cc XYZ@abc.com;traj@abc.com -u Backup Failed on DBSERVER01 -m Backup job failed please check -s 192.168.1.103', @database_name = N'', @server = N'', @database_user_name = N'', @subsystem = N'CmdExec', @cmdexec_success_code = 0, @flags = 0, @retry_attempts = 0, @retry_interval = 1, @output_file_name = N'', @on_success_step_id = 0, @on_success_action = 1, @on_fail_step_id = 0, @on_fail_action = 2

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    EXECUTE @ReturnCode = msdb.dbo.sp_update_job @job_id = @JobID, @start_step_id = 1

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    -- Add the job schedules

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id = @JobID, @name = N'Schedule 1', @enabled = 1, @freq_type = 8, @active_start_date = 20060410, @active_start_time = 213000, @freq_interval = 126, @freq_subday_type = 1, @freq_subday_interval = 0, @freq_relative_interval = 0, @freq_recurrence_factor = 1, @active_end_date = 99991231, @active_end_time = 235959

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    -- Add the Target Servers

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @JobID, @server_name = N'(local)'

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    END

    COMMIT TRANSACTION

    GOTO EndSave

    QuitWithRollback:

    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION

    EndSave:

    Thanks in advance

  • Hi All,

    Need to fix this on Prod.

    Environment : SQL 2000 Standard Edition on 2000 Server.

    Pl let me know if you need any details from my side. I will do that.

    Don't know why is it behaving in such a way. One day the previous backup is getting deleted and for the next day, it is failing to delete the old backup and sending an error mail saying that backup has been failed!! but the backup is successfull.

    Only thing is , unable to delete the previous days backup.

    How to fix this issue???????????

  • For more specific error messages, view the history for the maintenance plan instead of the logs. I don't find the logs useful at all.

    I suspect that your problem is going to be some other process has the older file locked when you try to delete it. This could be caused by a process trying to back up the file to tape - or copy it off to another location.

    I would check to see what time that folder is backed up to tape and see if the times are overlapping.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • If the package cannot complete all steps, it fails. So that is the mail you are getting. It isn't a "backup" it's a package.

    One thing you can do is use a separate package/plan for the cleanup (deleting files) and have one just for the backups. Then you can have notification of backup failure only.

    You can chain these jobs to execute one after the other, so you can then diagnose the issue. I'm not sure why the delete fails some times. The thing that comes to mind is that the file is in use for some reason.

  • Super!

    I will check whether any process is locking the backup file at the same time before deletion.

    Also, i ll check the Maintenance Plan history.

    Thank you!

  • Steve Jones - Editor (11/1/2009)


    If the package cannot complete all steps, it fails. So that is the mail you are getting. It isn't a "backup" it's a package.

    One thing you can do is use a separate package/plan for the cleanup (deleting files) and have one just for the backups. Then you can have notification of backup failure only.

    You can chain these jobs to execute one after the other, so you can then diagnose the issue. I'm not sure why the delete fails some times. The thing that comes to mind is that the file is in use for some reason.

    Hi Jeffrey,

    I tried to see the Maintenance plan history but nothing much usefull information i get from it.

    Is there any way, to track what all processes which is locking the "......BAK" file ???

    if it is there , then it would be easy for me to trace out.

  • When you looked at the maintenance plan history - which task failed? What was the error message for the task that failed?

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Hi Jeffery,

    This is what i found.

    Activity

    1.Backup database

    2.Verify Backup

    3.Delete old db backup files

    I applied the Filter on Plan name , Server name , database and Status : Failed

    Few records have been filtered and when i double click on it , it is showing

    Activity : Delete old db backup files

    Status : Failed

  • I tried to check out if there is any entry made in the system event viewer --> under Application ,

    i can find nothing related to the failure of the steps.

    I can find only an entry for the database backup.

    I can the activity has failed only inside the LOGs which has been configured when the Maintenance Plan was created i.e. Reports Tab and which i had already posted in my above first post.

  • Is there anything like, while taking the backup , it is trying to delete it??

    But, it is clearly shown in the below log, once after taking the backup , it is verifying the backup and then deleting it. Right ???

    what could be other reasons.. if it is show, atleast it should have place any entry inside Event Viewer.

    Again pasting the log for reference.

    Microsoft (R) SQLMaint Utility (Unicode), Version Logged on to SQL Server 'SQL01' as 'EDFUSION\sql_service' (trusted)

    Starting maintenance plan 'All Application DBs' on 11/7/2009 9:30:00 PM

    [1] Database ActionOI_CDI: Database Backup...

    Destination: [E:\SQLDUMP\EDFUSION_CDI\EDFUSION_CDI_db_200911072130.BAK]

    ** Execution Time: 0 hrs, 0 mins, 1 secs **

    [2] Database EDFUSION_CDI: Verifying Backup...

    ** Execution Time: 0 hrs, 0 mins, 1 secs **

    [3] Database EDFUSION_CG: Database Backup...

    Destination: [E:\SQLDUMP\EDFUSION_CG\EDFUSION_CG_db_200911072130.BAK]

    ** Execution Time: 0 hrs, 1 mins, 31 secs **

    [4] Database EDFUSION_CG: Verifying Backup...

    ** Execution Time: 0 hrs, 0 mins, 33 secs **

    [5] Database EDFUSION_Queue: Database Backup...

    Destination: [E:\SQLDUMP\EDFUSION_Queue\EDFUSION_Queue_db_200911072132.BAK]

    ** Execution Time: 0 hrs, 0 mins, 8 secs **

    [6] Database EDFUSION_Queue: Verifying Backup...

    ** Execution Time: 0 hrs, 0 mins, 3 secs **

    [7] Database EDFUSION_RptLog: Database Backup...

    Destination: [E:\SQLDUMP\EDFUSION_RptLog\EDFUSION_RptLog_db_200911072132.BAK]

    ** Execution Time: 0 hrs, 0 mins, 3 secs **

    [8] Database EDFUSION_RptLog: Verifying Backup...

    ** Execution Time: 0 hrs, 0 mins, 1 secs **

    [9] Database EDFUSION: Database Backup...

    Destination: [E:\SQLDUMP\EDFUSION\EDFUSION_db_200911072132.BAK]

    ** Execution Time: 0 hrs, 6 mins, 11 secs **

    [10] Database EDFUSION: Verifying Backup...

    ** Execution Time: 0 hrs, 3 mins, 12 secs **

    [11] Database EDFUSION_CDI: Delete Old Backup Files...

    1 file(s) deleted.

    [12] Database EDFUSION_CG: Delete Old Backup Files...

    1 file(s) deleted.

    [13] Database EDFUSION_Queue: Delete Old Backup Files...

    1 file(s) deleted.

    [14] Database EDFUSION_RptLog: Delete Old Backup Files...

    1 file(s) deleted.

    [15] Database EDFUSION: Delete Old Backup Files...

    Unable to delete file E:\SQLDUMP\EDFUSION\EDFUSION_db_200911062132.BAK. 0 file(s) deleted.

    Deleting old text reports... 1 file(s) deleted.

    End of maintenance plan 'All Application DBs' on 11/7/2009 9:41:43 PM

    SQLMAINT.EXE Process Exit Code: 1 (Failed)

  • If you take a look at the below sequence of taking and backup,verifying and deleting the previous day backup, "EDFUSION" is the last database which is getting deleted.

    I also filtered for Failed Status Activities inside the maintenance plan history, "EDFUSION" is always the last database which is getting deleted and the job is getting failed for this particular database only. I didnt see in failures for the remaining database backups.

    Sequence of taking the backups

    EDFUSION_CDI

    EDFUSION_CG

    EDFUSION_Queue

    EDFUSION_RptLog

    EDFUSION

  • Ok, never mind. I looked at your code and I saw what was happening...

    Try this if its not a mission critical system to see if its your backup job or if its something being accessed.

    Put a job to stop SQL and then start it using Single User mode. Then have your routine run and see if it works just fine.

    You could do this manually or writting a simple VBS or batch file.

  • Hi,

    I dont have required permissions to restart the sql server in single user mode. That is not allowed. Is there any other alternative method through which i should be able to gather information about that backup file/ any file at OS and say that it being locked or it is in use by another process!!

    Is there any sql code/another way, through which i can trace all the processes which is accessing this file at that particular moment of time???

    Any help would be greatly appreciated.

    Thanks in advance.

  • Start>Administrative Tools>Computer Management.

    Action>Connect to another computer.

    Type in Server Name

    click on Shared Folders>Sessions>OpenFiles

    Look for the username that is accessing that file.

  • Hi,

    The solution was superb.

    Do you have similar kind of thing through which we check from command line i.e dos prompt. or else does this information is stored/ archived or else can we archive this information into text file so that we can show it as proof of concept.

    Am asking this because the Job will run midnight so if we can trace and audit those share information / file access information then it would be great!

    Thanks in advance.

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply