MS SQL 2008 Standard backups cause application jdbc timeouts

Question

Post reply

MS SQL 2008 Standard backups cause application jdbc timeouts

andrew.bruno

SSC Veteran

Points: 256
More actions
September 25, 2009 at 12:19 am

#211410

I have a situation where the backing up 16 databases is taking around 1 hour. This doesn’t bother me so much.

What does, is the fact that during this hour, transactions timeout or expire. Its like the backing up is locking the database or something like that?

I also have a ReIndexing maintenance plan (which also runs for an hour), and this is also causing transactions to timeout, but not as bad as the backup maintenance plan.

I am using:

SQL 2008 Standard edtion on a Quad Core, 16G, 4x146G RAID1+0

The server has 16 databases, the largest being 50G, followed by 10G, 3.5G, a couple of 2G and the rest under 1G.

Any ideas?

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

dba_pkashyap Hall of Fame Points: 3336 More actions · Answer 1

when do the backups/indexes jobs run, I mean at what time?

They should run only when the user activity is less...see if there is a chance to run them at a different time. Is it a full backup running daily...amend the maintenance plans and try implementing differential backups with transaction logs.

What are 16 databases doing there??are all of them production db's, if not then why are all sitting on the same server?

Silverfox SSC-Insane Points: 23150 More actions · Answer 2

andrew.bruno (9/25/2009)

I have a situation where the backing up 16 databases is taking around 1 hour. This doesn’t bother me so much.

What does, is the fact that during this hour, transactions timeout or expire. Its like the backing up is locking the database or something like that?

I also have a ReIndexing maintenance plan (which also runs for an hour), and this is also causing transactions to timeout, but not as bad as the backup maintenance plan.

I am using:

SQL 2008 Standard edtion on a Quad Core, 16G, 4x146G RAID1+0

The server has 16 databases, the largest being 50G, followed by 10G, 3.5G, a couple of 2G and the rest under 1G.

Any ideas?

As suggested re-schedule your jobs, re-indexing should be done at night or at weekends when there is no activity. your full backups could be done at night after the last user has come off the system, you could schedule differential backups instead if you are doing a lot of log backups and want to cut down on the restore time if you need to restore.

--------------------------------------------------------------------------------------
[highlight]Recommended Articles on How to help us help you and[/highlight]
[highlight]solve commonly asked questions[/highlight]

Forum Etiquette: How to post data/code on a forum to get the best help by Jeff Moden[/url]
Managing Transaction Logs by Gail Shaw[/url]
How to post Performance problems by Gail Shaw[/url]
Help, my database is corrupt. Now what? by Gail Shaw[/url]

andrew.bruno SSC Veteran Points: 256 More actions · Answer 3

All 16 databases are production databases on this SQL server instance. The web systsem is a 24/7 international application, so uptime is always required, and there are transactions happening all the time. I dont understand how I can run maintenance plans at a low peak time, when there is always peak. Also, the current server has about 500 customers spread across the DBs. I can in theory split all the DBs, so I could have 500 dbs, each representing a customer, but I dont know how SQL server would handle that many DBs.

At the moment I have the following maintenance plans:

FullBackups:

- Occurs every day every 12 hour(s) between 12:30:00 AM and 12:25:00 AM.

Differential Backups:

- Occurs every day every 30 minute(s) between 3:00:00 AM and 11:59:59 AM.

- Occurs every day every 30 minute(s) between 3:00:00 PM and 11:59:59 PM.

RebuildIndex:

- Occurs every day at 2:30:00 AM

Is this a limitation of SQL Standard. Will upgrading to SQL Enterprise solve this?

How are other 24/7 enterprise systems handling backups? Should I consider a master/slave cluster instead?

Any other ideas?

george sibbald SSC Guru Points: 104210 More actions · Answer 4

That sounds rather frequent for the backups. Depends on your recovery requirements but daily full backups usually more than suffice with say 4 hourly diffs and tran logs between. Thats the classic scenario on a heavily used system.

Are you backing up to a different drive than the dbs sit on? Multiple backups will cause I/O spikes so spreading the I/O load is worth while.

SQL will support 500 dbs but that does not mean its a good idea to do so.

---------------------------------------------------------------------

dba_pkashyap Hall of Fame Points: 3336 More actions · Answer 5

As per your statements,

You are doing full backup twice daily and diff every 30 min.

Change the full backup to schedule as only once a day, and differential every 4 hrs and tran every 15 mins as everyone suggested.

Are all the 16 databases pointing to the same application?

Check the hardware infrastructure like RAID's, SAN/NAS etc and see if the backup is saved in the same location as the original mdf/ldf files.

Check for the tempdb size and if possible move them to a new disk/drive.

Ask your hardware team to create new mount points with external disks and re-point your databases mdf and ldf's to different disks.

Good Luck!

Pavan

EdVassie SSC Guru Points: 60312 More actions · Answer 6

Are you running 32-bit or 64-bit?

On 32-bit servers you have limited memory available to run things such as backups. This memory is also shared with other tasks, such as application memory.

You may see your problem disappear if you upgrade to 64-bit. Alternatively, you can stagger the start times of your backups so that a maximum of (say) 4 are running at one time.

Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

andrew.bruno SSC Veteran Points: 256 More actions · Answer 7

Its a 64bit OS with 16G ram.

Most of the backups run under 5minutes, its just that two of the databases (the largest ones), i.e. 50G take about 50 minutes each to backup.

Does anyone know whether SQL Standard 2008 does a transaction lock on DB when it backs up? or whether the index is taken offline, so something along those lines..

EdVassie SSC Guru Points: 60312 More actions · Answer 8

SQL Server takes a shared schema lock when a backup is taken. This prevents the database from being dropped while the backup is running, but does not prevent any other activity.

Do your long-running backups output to a single file? If you output to multiple files (no more than 1 file per core), then SQL starts multiple threads to read the database. Each thread then backs up a portion of the database to its own backup file.

Changing from a single output file to 4 output files can reduce the backup time by up to 75%. You will need to test what works best in your system, also taking into account the impact the backup will have on your query workload.

(I am not sure of the algorithm used, but possibly it divides the space map pages by the number of files and assigns a set of space map pages to each thread, telling them to back up the pages referenced in the space maps.)

Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

andrew.bruno SSC Veteran Points: 256 More actions · Answer 9

andrew.bruno

SSC Veteran

Points: 256

September 30, 2009 at 3:03 am

#1059851

Yes each database is backed up to a single new .bak file. The largest file is 28G at the moment.

andrew.bruno SSC Veteran Points: 256 More actions · Answer 10

andrew.bruno

SSC Veteran

Points: 256

September 30, 2009 at 3:06 am

#1059853

Also got this thread going..

http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/87ec685c-cbe5-4884-a6d7-848ba7eea55d

philcart SSC-Forever Points: 47794 More actions · Answer 11

4x146G RAID1+0

Sounds like a single disk array on which you have OS, paging file, tempdb, data files, log files and backups. You may have partitioned the array into multiple logical volumes, but it is still one physical array doing the work.

I'd check the disk I/O while backups are being performed.

As a rough guess I'd say your queries are being killed by long disk queues.

For SQL you should have 2-3 seperate physical arrays to seperate the I/O. With your current hardware, you'd probably be better served having 2 x 146G (Raid 1) x 2

--------------------
Colt 45 - the original point and click interface

andrew.bruno SSC Veteran Points: 256 More actions · Answer 12

Actually its just C and E drive. C has 30G, and E the rest.

Sounds like I'll need to organise a new server, and migrate dbs across.

What would be the ideal specs, say to get 500G hd, and 32G RAM.

Do you think its worth buying the Enterprise licence rather then the Standard licence. I know that the reindexing will stay online with enterprise. Are there any other advantages?

or should I consider getting 2 standards servers, and using replication between the two?

philcart SSC-Forever Points: 47794 More actions · Answer 13

I grabbed the stats you posted on the other thread.

For the wait stats its pretty cut and dried.

PAGEIOLATCH_SH 1130875600 24069458365 300005 126621887

You've accumulated over 278 days of IO wait time.

http://support.microsoft.com/kb/822101

IO latch: The IO latches are a subset of BUF latches that are used when the buffer and associated data page or the index page is in the middle of an IO operation. PAGEIOLATCH waittypes are used for disk-to-memory transfers and a significant waittime for these waittypes suggests disk I/O subsystem issues.

I've formatted the dm_io_virtual_file_stats output into a friendlier format. The excel sheet is attached.

Couple of significant points

1) The amount of Read/Write I/O by far exceeds the size of the data files. This would tend to indicate you're pushing more data around than needed. Most likely due to inefficient queries, lack of indexes, etc... Some tuning could reduce the I/O load significantly.

2) The data file for the my_rdr database is encountering a significant portion of I/O stall, but then it is doing 76% of the total I/O on the server.

If you're definitely going to get more hardware, I'd suggest a disk layout something along the lines of the following

(2 disks) RAID 1 - OS/Paging

(2 disks) RAID 1 - TempDB

(2 disks) RAID 1 (or RAID 10 if possible) - Other Database log (ldf) files

(4 disks) RAID 10 - Other database data (mdf) files

(4 disks) RAID 10 - my_rdr data file

(4 disks) RAID 10 - my_rdr Log File

(4 disks) RAID 5 (for the space) - Database backups

Thats a total of 22 disks. For a less optimal setup, your could reduce the disks by putting database backups with the OS/Paging array and only have (2 disk) RAID 1 for my_rdr log file. Most external enclosures have 14 slots. If you're thinking of connecting to a SAN, make sure that you have dedicated disks and HBAs to support your I/O

Your largest file is only 48GB so even 72GB drives would be sufficient. Always go for drives with high rpm like 15K rpm.

If you're prepared to throw money at the hardware, you could even look at getting some solid state drives.

Hope this helps some.

--------------------
Colt 45 - the original point and click interface

andrew.bruno SSC Veteran Points: 256 More actions · Answer 14

Wow! Great response 😀

Just wondering, anything I can do in the short term? cause umm.. how much is a lot of money? 🙂

For example, if I get an external disk, and use that for backups?

Or if I split the database. The largest DB has 200 customers, so in theory I can split to 100 customer in each db if need to.

I know I have a few varchar to nvarchar changes to make too, and some external index optimizations.. just trying to work which is the faster path to less timeouts.