Re: Intermittent slow database resposne

  • Just for my own information, why did you choose 7168 as the maximum server memory?

    7168MB / 1024 = 7GB

    You should have at least 1GB for the OS. Gila recommended 1.5.

    Any interesting on the performance counters?

  • DavidB (8/27/2008)


    Very sorry about that. The link I was referring to was http://support.microsoft.com/kb/918483.

    Not so sure that applies based on the errors that you provided.

    Have you installed the performance dashboard as recommended by an earlier response? I would be interested in seeing what you are seeing as your primary wait types.

    select * from sys.dm_os_wait_stats order by wait_time_ms desc

    I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

    Here's a snapshot summary of the current task manager/performance tab:

    PF Usage: 8.06GB

    Physical Memory (K)

    Total: 8383116

    Available: 208096

    System Cache: 676940

    Commit Charge (K)

    Total: 8458284

    Limit: 11761148

    Peak: 11022620

    Here's the current wait times from the performance dashboard:

    Wait Category Number of Waits Wait Time (sec) % Wait Time

    Sleep 92258898 132896.686 51.79%

    Parallelism 36760626 57328.218 22.34%

    Other 203254943 33374.994 13.01%

    Buffer IO 2558454 24789.561 9.66%

    Logging 3022637 2843.078 1.11%

    Lock 84328 2281.701 0.89%

    Latch 869246 1280.718 0.50%

    Scheduler Yield 16314131 1243.343 0.48%

    Network IO 26328 362.187 0.14%

    Buffer Latch 3763003 180.328 0.07%

    Memory 5061 1.625 0.00%

    Here's the currents results from sys.dm_os_wait_stats:

    wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms

    SQLTRACE_BUFFER_FLUSH 16216 64860578 4281 546

    LAZYWRITER_SLEEP 66871 64822687 1296 3078

    CXPACKET 36751505 57303687 532390 1913500

    PAGEIOLATCH_SH 2515588 23759968 4015 132578

    BROKER_TASK_STOP 2764 13695843 10015 500

    BROKER_RECEIVE_WAITFOR 36 7487453 600000 15

    IO_COMPLETION 597881 5658250 2406 10203

    WRITELOG 2936245 2669671 4437 116312

    OLEDB 201284544 1756562 52687 0

    LCK_M_U 83012 1738359 20000 2328

    BACKUPIO 1142998 1623281 2000 1500

    LATCH_EX 868884 1280375 906 186546

    SOS_SCHEDULER_YIELD 16044649 1237187 750 1234000

    SLEEP_TASK 92094968 1191093 437 1142906

    ASYNC_IO_COMPLETION 100 1147593 349718 0

    BACKUPBUFFER 222418 1116890 953 1890

    PAGEIOLATCH_EX 39569 904937 2421 203

    BACKUPTHREAD 999 814796 364468 15

  • I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

    Sounds like a good time to dig into looking at memory and disk performance.

    Also, what is the wait in the parallelism?

  • sam (8/27/2008)


    I logged into the server this morning, and everything seems to be running ok. In other words, SQL server queries are running fast, and RDC is running at normal speed consistently. I didn't do anything to fix the problem, so I guess it "magically" fixed itself.

    Sounds like a good time to dig into looking at memory and disk performance.

    Also, what is the wait in the parallelism?

    Here's the info on parallelism:

    Wait Type Number of Waits Wait Time (sec) % Wait time Max Wait Time (ms) Avg Wait Time (ms)

    CXPACKET 36847923 57496.859 100.00% 532390 1.6

  • sam (8/26/2008)


    http://www.google.com/search?q=troubleshooting+sql+memory+problems&sourceid=ie7&rls=com.microsoft:en-US&ie=utf8&oe=utf8

    The first link has a few counters you could look at.

    Is this the website that you were referring to regarding performance counters:

    http://sqlserver-qa.net/blogs/perftune/archive/2007/06/19/memory-issues.aspx

    Thanks again for all your input.

  • jlp3630 (8/27/2008)


    I'm running RAID-0 on my c: drive (contains Windows, SQL server, master, model and msdb databases and log files).

    I'm running RAID-50 on my f: drive (tempdb, user databases and log files).

    RAID 50? That's the first time I've heard of anyone using that Raid level.

    Just curious, why RAID 50?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/27/2008)


    jlp3630 (8/27/2008)


    I'm running RAID-0 on my c: drive (contains Windows, SQL server, master, model and msdb databases and log files).

    I'm running RAID-50 on my f: drive (tempdb, user databases and log files).

    RAID 50? That's the first time I've heard of anyone using that Raid level.

    Just curious, why RAID 50?

    The server was setup before I started working for this client. I have no idea why it's setup that way. When I find out, I'll let you know.

    Also, my c: drive is RAID-1 not RAID-0.

  • On our 64 bit servers we set the memory to allow 1.5 - 2 GB for the OS as Gail recommended and set Maxdop at 1 or 2. If you leave Maxdop at 0 or set it to some higher value (than 1) then set the cost threshold up to a higher value (10 or more).

    It looks like parallelism might be a problem based on your waitstats (did you reset them?).

  • Cliff Jones (8/28/2008)


    On our 64 bit servers we set the memory to allow 1.5 - 2 GB for the OS as Gail recommended and set Maxdop at 1 or 2. If you leave Maxdop at 0 or set it to some higher value (than 1) then set the cost threshold up to a higher value (10 or more).

    It looks like parallelism might be a problem based on your waitstats (did you reset them?).

    Cliff,

    Just for my own knowledge, could you explain the logic behind these two suggestions:

    1. set Maxdop to 1 or 2

    2. maxdop at 0 or higher than 1 & cost threshold > 10

    I didn't reset my wait stats when I posted these sys.dm_os_wait_stats results:

    wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms

    SQLTRACE_BUFFER_FLUSH 16216 64860578 4281 546

    LAZYWRITER_SLEEP 66871 64822687 1296 3078

    CXPACKET 36751505 57303687 532390 1913500

    PAGEIOLATCH_SH 2515588 23759968 4015 132578

    BROKER_TASK_STOP 2764 13695843 10015 500

    BROKER_RECEIVE_WAITFOR 36 7487453 600000 15

    IO_COMPLETION 597881 5658250 2406 10203

    I just reset them for further monitoring.

    Thanks for your input.

    Jon

  • We have found on our 64 bit servers (especially our Itanium) that if we leave Maxdop at 0, the server spends a lot of CPU time transferring the load from one CPU to another (context switching). We actually achieved better performance by setting Maxdop to 1 (which was actually what our HP vendor reccommended).

    On some of our other non-Itanium servers I found that I was able to home in on a proper value for Maxdop by tinkering with the Maxdop and the threshold values during times of high CPU utilization to home in on a better value. I did this by resetting the waitstats, let the server run for an hour or so and then rechecking the CXPacket waits and reducing the signal wait percentage. Start at Maxdop 1 for a benchmark, then Maxdop 2. At Maxdop = 2 increase the threshold to see if you can improve on the baseline.

    Did you install SQL 2005 from a fresh install (rather than an in place upgrade)?

  • I don't remember where I got this but here is the query I use to get the Signal Wait percentages:

    create table #waitstats (

    waittype varchar(80),

    requests numeric(20,1),

    waittime numeric (20,1),

    signalwaittime numeric(20,1))

    insert into

    #waitstats (waittype, requests, waittime, signalwaittime)

    select

    wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms

    from

    sys.dm_os_wait_stats

    declare @totalwait numeric(20,1),

    @totalsignalwait numeric(20,1) ,

    @endtime datetime,

    @begintime datetime

    --get the totals

    select

    @totalwait=sum(waittime), @totalsignalwait=sum(signalwaittime)

    from

    #waitstats

    --- subtract waitfor, sleep, and resource_queue from Total

    select

    @totalwait = @totalwait - sum(waittime), @totalsignalwait = @totalsignalwait - sum(signalwaittime)

    from

    #waitstats

    where

    waittype in ('Waitfor','Sleep','Resource_Queue')

    -- insert adjusted totals, rank by percentage descending

    insert into #waitstats select '***total***', 0, @totalwait, @totalsignalwait

    select

    waittype,

    waittime,

    percentage = cast(100 * waittime / @totalwait as numeric(20,1)),

    signalwaittime,

    percentagesw = cast(100 * signalwaittime / @totalsignalwait as numeric(20,1))

    from

    #waitstats

    where

    waittype not in ('waitfor','sleep','resource_queue','total')

    order by percentage desc

    select '%signal'= 100 * (@totalsignalwait / @totalwait)

    drop table #waitstats

  • Cliff Jones (8/28/2008)


    We have found on our 64 bit servers (especially our Itanium) that if we leave Maxdop at 0, the server spends a lot of CPU time transferring the load from one CPU to another (context switching). We actually achieved better performance by setting Maxdop to 1 (which was actually what our HP vendor reccommended).

    On some of our other non-Itanium servers I found that I was able to home in on a proper value for Maxdop by tinkering with the Maxdop and the threshold values during times of high CPU utilization to home in on a better value. I did this by resetting the waitstats, let the server run for an hour or so and then rechecking the CXPacket waits and reducing the signal wait percentage. Start at Maxdop 1 for a benchmark, then Maxdop 2. At Maxdop = 2 increase the threshold to see if you can improve on the baseline.

    Did you install SQL 2005 from a fresh install (rather than an in place upgrade)?

    Our current database server was loaded via a fresh SQL 2005 install. However, the majority of the databases were migrated from a SQL 2000 database server.

  • That's a good answer. If you upgraded then some of your default values may not be correct (like Max Worker Threads).

    My guess is that if you set both your min and max memory to allow 1.5 GB or more to the OS then your problem will go away, as long as you don't have some other application running on the server. When SQL Server gobbles up the memory beyond a certain point the server begins to thrash and CPU hits the roof.

    I have seen this several times with our clients servers.

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply