Understanding what is happening during a backup to SAN

  • 3+ hours for a 40GB backup? Is your NIC 100 Mb or 1 Gb? Regardless, that is out of control ... sounds like you may be experiencing some blocking. I assume you're using Activity Monitor to look at this SPID, correct? Scroll to the right and check the blocked/blocking columns.

    Also, what is the status of this SPID? Running/Suspended/Sleeping?

    I've had a look. There's no blocking, but the SPID is suspended with a wait type of ASYNC_IO_COMPLETION on the top row and BACKUPBUFFER in the middle row. IO_COMPLETION goes on and off on the last row, where there is CPU activity.

    Hmm, these are types of problems I haven't had the privilege of experiencing ... usually suspended tasks are a direct result of an IO wait and your wait type definitely helps back that.

    Have you attempted to back up the database locally or to another location to see what happens? My gut feeling is that you either have a network issue to your destination, or something up with the destination drive itself. I'm not much of a hardware guy, can't really shed more light than that at this time.

    I'd say stop the process, try to back up locally and if successful, hopefully you at least can eliminate SQL from the equation.
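To put the NIC question in numbers, here is a rough sketch in Python (not from the thread) of the best-case time to push a backup over the network. It assumes the link is the only bottleneck, ignores protocol and disk overhead, and uses the 180 GB full backup size quoted elsewhere in this thread. The function name is made up for illustration.

```python
def best_case_transfer_hours(size_gb: float, link_mbit_per_s: float) -> float:
    """Ideal time to move size_gb (decimal gigabytes) over a link, ignoring all overhead."""
    size_megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    seconds = size_megabits / link_mbit_per_s
    return seconds / 3600


if __name__ == "__main__":
    for label, speed in (("100 Mb/s", 100), ("1 Gb/s", 1000)):
        hours = best_case_transfer_hours(180, speed)
        print(f"180 GB over {label}: about {hours:.1f} hours at best")
```

On those assumptions a 100 Mb link needs about 4 hours for 180 GB even in the ideal case, while a 1 Gb link needs well under an hour, so a multi-hour run on a 1 Gb link would point at something other than raw link speed.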

  • Hi,

    I may be missing some of the post, but from what I have been able to read, it sounds like this may be disk subsystem related. What is your backup destination? What is the RAID configuration, if any? How many filegroups does the database contain? What other activity have you observed on the server?


    Phillip Cox

    MCITP - DBAdmin

  • This is in the error log:

    Unknown,The operating system returned error 1450(Insufficient system resources exist to complete the requested service.) to SQL Server during a write at offset 0x000021fd852200 in file with handle 0x000019D8. This is usually a temporary condition and the SQL Server will keep retrying the operation. If the condition persists then immediate action must be taken to correct

    I've cancelled the backup and am going to try a new approach.

    I'm backing up to a remote path, \\lon-nas01\fullbackups\backup.bak, which is on SAN storage configured as RAID 1. There are quite a few processes running, so I think that's not helping things.

    2 filegroups, Primary and Indexes.

    At this point the NAS is the only box on the SAN. I am going to connect the passive node of the DB that I'm trying to back up to the SAN, do a failover, present a disk to the active node and do a backup.

  • As has been stated, you've added the network component into the mix. Backing up to a SAN is going to be slower than a local backup. Did you look at third-party tools (LightSpeed, SqlSafe, etc.)? These tools can compress the backup and speed up the process. Or can you back up locally, maybe zip the file, and move the backup file to the SAN? How are you determining that it is 40 GB short of completion? If you're looking at the file on the SAN, my experience has been that it'll show a zero file size until completion occurs, so I wouldn't use that as a benchmark.

  • We use Redgate on most of our servers. For whatever reason this server did not have it installed.

    I installed it yesterday and tried to do a backup, but it gave me connectivity problems.

    The full backup size (.bak) is 180GB. It was sitting at 140GB. But now I know not to trust the file size the SAN shows.

    I don't have space locally for a backup of that size. I know Redgate can compress by up to 5 times, but I still don't have 30GB spare to do that, so I have no choice but to back up to the network.
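A quick sanity check on the space arithmetic, as a Python sketch (not from the thread). It assumes the best-case 5x ratio mentioned above actually holds, which depends entirely on the data. The function name is illustrative.

```python
def compressed_size_gb(full_size_gb: float, ratio: float) -> float:
    """Estimated compressed backup size for a given compression ratio (e.g. 5 for 5:1)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return full_size_gb / ratio


if __name__ == "__main__":
    size = compressed_size_gb(180, 5)
    print(f"180 GB at 5:1 compresses to about {size:.0f} GB")
```

So even at the best-case ratio, a compressed local copy of the 180GB backup would need roughly 36GB free, and more if the data compresses less well.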

  • I have to agree with Phillip about the disk subsystem. Are any errors being raised? Can you back up a smaller database using Redgate? I don't believe it's the backup software. It's got to be hardware or network related. Get one of the infrastructure guys to help you troubleshoot (unless of course you wear that hat too!).

  • Hi

    Yesterday we started using a SAN and I am trying to back up a DB to it. The backup used to take around 2 hours to run, but this one has been running for over 3.

    I have 1 SPID associated with the backup, showing 3 rows in sp_who2:

    CPUTime    DiskIO
    470        204
    8734       76349
    1021719    0

    The CPUTime on the last row is increasing, but all the other values are static.

    I'm trying to understand what is going on at a granular level here. I can't find any good documentation on how SQL Server does backups. There only seems to be documentation on "how to" do backups.

    The current state of the backup is that it is 40GB short of completion, but it's been running for 3 hours. The SPID's CPU is doing something, but what?

    Please can you point me to a good source of in-depth SQL Server technical documentation on backups, if you know of any?

    Also, how do people feel about SANs? This is the first time I'm using one.



