SQL Server 2005 self restart

imarchenko-236376

SSCommitted

Points: 1734
February 22, 2006 at 5:15 am

#111672

Hello!

We have configured SQL Server 2005 64-bit (8 CPU / 28 GB RAM) on a Windows 2003 Active/Passive cluster. Besides the fact that the 32-bit SQL Server Management Studio is very slow on the 64-bit computer (we RDP to the box), we are also experiencing a SQL Server self-restart problem when running DBCC DBREINDEX (or ALTER INDEX ALL ON, using the new syntax). SQL Server generates a dump and restarts itself. The dump reports issues similar to:

Process 54:0:5 (0x1494) Worker 0x00000000820001C0 appears to be non-yielding on Scheduler 3. Thread creation time: 12784883544792. Approx Thread CPU Used: kernel 78 ms, user 0 ms. Process Utilization 0%. System Idle 89%. Interval: 70000 ms.

Process 54:0:7 (0xa68) Worker 0x0000000081BDE1C0 appears to be non-yielding on Scheduler 4. Thread creation time: 12784883544808. Approx Thread CPU Used: kernel 46 ms, user 0 ms. Process Utilization 0%. System Idle 88%. Interval: 110204 ms.

...

Process 54:0:7 (0xa68) Worker 0x0000000081BDE1C0 appears to be non-yielding on Scheduler 4. Thread creation time: 12784883544808. Approx Thread CPU Used: kernel 62 ms, user 0 ms. Process Utilization 0%. System Idle 87%. Interval: 104782 ms.

The connection has been lost with Microsoft Distributed Transaction Coordinator (MS DTC). Recovery of any in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC) will begin once the connection is re-established. This is an informational message only. No user action is required.

Service Broker manager has shut down.

and finally:

SQL Server is terminating in response to a 'stop' request from Service Control Manager. This is an informational message only. No user action is required.

I was wondering if anyone has experienced problems similar to those described above.

Thanks,

Igor
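
The "non-yielding" lines quoted above carry a few numbers worth comparing: CPU time used by the thread, overall utilization, and the length of the interval. Below is a minimal sketch, assuming the message layout shown in this post, that pulls those fields out of an error log line. The names `parse_non_yielding` and `looks_blocked`, and the 5% threshold, are hypothetical choices for illustration. The reading that very low CPU time over a long interval points to a thread that is waiting on something rather than spinning is a rule of thumb, not a diagnosis.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the message layout quoted in this thread (an assumption:
# other builds or languages may word the message differently).
NON_YIELDING = re.compile(
    r"Worker (?P<worker>0x[0-9A-Fa-f]+) appears to be non-yielding "
    r"on Scheduler (?P<scheduler>\d+)\."
    r".*?kernel (?P<kernel_ms>\d+) ms, user (?P<user_ms>\d+) ms\."
    r" Process Utilization (?P<process_pct>\d+)%\."
    r" System Idle (?P<idle_pct>\d+)%\."
    r" Interval: (?P<interval_ms>\d+) ms\."
)


class NonYielding(NamedTuple):
    worker: str
    scheduler: int
    kernel_ms: int
    user_ms: int
    process_pct: int
    idle_pct: int
    interval_ms: int


def parse_non_yielding(line: str) -> Optional[NonYielding]:
    """Return the parsed fields, or None if the line is not such a message."""
    m = NON_YIELDING.search(line)
    if m is None:
        return None
    return NonYielding(
        worker=m["worker"],
        scheduler=int(m["scheduler"]),
        kernel_ms=int(m["kernel_ms"]),
        user_ms=int(m["user_ms"]),
        process_pct=int(m["process_pct"]),
        idle_pct=int(m["idle_pct"]),
        interval_ms=int(m["interval_ms"]),
    )


def looks_blocked(rec: NonYielding, cpu_fraction: float = 0.05) -> bool:
    """Heuristic: the thread used under cpu_fraction of the interval as CPU time."""
    return (rec.kernel_ms + rec.user_ms) < cpu_fraction * rec.interval_ms


if __name__ == "__main__":
    sample = (
        "Process 54:0:5 (0x1494) Worker 0x00000000820001C0 appears to be "
        "non-yielding on Scheduler 3. Thread creation time: 12784883544792. "
        "Approx Thread CPU Used: kernel 78 ms, user 0 ms. "
        "Process Utilization 0%. System Idle 89%. Interval: 70000 ms."
    )
    rec = parse_non_yielding(sample)
    print(rec, looks_blocked(rec))
```

For the first line in the post, the worker used 78 ms of CPU over a 70000 ms interval while the system was 89% idle, so the heuristic flags it as blocked rather than busy.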

Joseph Mulhall SSCarpal Tunnel Points: 4672 · Answer 1

February 23, 2006 at 2:38 am

#622836

Looks like the problem is with MSDTC and the cluster, not the command you're sending.

Try removing MSDTC (msdtc -uninstall) from both nodes - you'll need to break the cluster to do this, then reinstall msdtc.

Bruce Van Buren Valued Member Points: 51 · Answer 2

Igor,

I have seen this type of error on a 64-bit 2000 box, but it was resolved in SP4. I have not seen it on 2005. One thing I have noticed with SQL Server is that reindexing can consume a lot of resources. What I would try is:

1. Lower your max degree of parallelism to between 2 and 4 so that the reindex only uses 2-4 processors. I have an 8-way box and 4 is my setting.

2. Leave 1.5 or 2 GB of memory, or more, for the OS, depending on what else you have running on the box (RS, SQL Agent, etc.). Try bumping SQL Server's memory down to 20 GB and see if that solves the problem, then ease it back up.

Hope this helps (let us know if it does!)
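
The two suggestions above correspond to the `max degree of parallelism` and `max server memory (MB)` server options, both of which are set through `sp_configure`. Here is a minimal sketch, assuming the values mentioned in this post (MAXDOP 4, 20 GB), that only renders the T-SQL for review. The helper name `render_sp_configure` is hypothetical, and nothing here connects to a server.

```python
def render_sp_configure(maxdop: int, max_memory_gb: int) -> str:
    """Render a T-SQL batch that caps parallelism and SQL Server memory."""
    if maxdop < 0:
        raise ValueError("maxdop must be 0 (use all CPUs) or a positive count")
    if max_memory_gb <= 0:
        raise ValueError("max_memory_gb must be positive")

    # The server option is expressed in megabytes.
    max_memory_mb = max_memory_gb * 1024

    statements = [
        # Both options are advanced options, so they must be made visible first.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max degree of parallelism', {maxdop};",
        f"EXEC sp_configure 'max server memory (MB)', {max_memory_mb};",
        "RECONFIGURE;",
    ]
    return "\n".join(statements)


if __name__ == "__main__":
    # Values taken from the post: 4 CPUs for parallel plans, 20 GB for SQL Server.
    print(render_sp_configure(maxdop=4, max_memory_gb=20))
```

On the 28 GB box described in this thread, a 20 GB cap would leave roughly 8 GB for the OS and cluster service, which can then be eased back toward the 1.5-2 GB reserve if the restarts stop.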

imarchenko-236376 SSCommitted Points: 1734 · Answer 3

Bruce,

The original problem with DBCC DBREINDEX was caused by a problem with the active node. Once we failed over to the other node, I managed to finish rebuilding the remaining tables.

Of course, there is something wrong with the cluster. Just yesterday, SQL Server restarted itself again. I am currently troubleshooting this issue and trying to find the cause.

I looked into SQL Server Error/Event log and noticed errors/warnings logged prior to restart related to:

1. Virtual Memory being too low

2. The connection has been lost with Microsoft Distributed Transaction Coordinator (MS DTC)

3. SSPI handshake failed with error code 0x8009030c while establishing a connection with integrated security; the connection has been closed.

I was wondering what virtual memory size you would recommend, given that we have 28 GB of RAM. Item 3 seems to be related to the SQL Server cluster not being able to verify the user's identity/certificate. We did reserve 1.5 GB for the OS.

We followed your recommendation and changed max degree of parallelism. In our case it has been set to 1.

Thanks,

Igor

Joseph Mulhall SSCarpal Tunnel Points: 4672 · Answer 4

February 24, 2006 at 1:43 am

#623130

How are your nodes connected, and how far apart are they?

Bruce Van Buren Valued Member Points: 51 · Answer 5

I have had this problem with a cluster before. The machine became starved for resources, the heartbeat was not detected, and the cluster initiated a failover. You may also want to reserve one processor for the OS and limit SQL Server to 7 processors. In 2000, processor 0 was designated for I/O, network, and other system functions. I do not know whether that is still the case in SQL 2005.

In the end, you are trying to leave enough resources for the system and the cluster no matter what SQL Server does. Identity verification problems are another symptom of resource constraints. With regard to the swap file, 3 GB sounds good, but make sure it is not on the same drives as your data and log files.

Thanks
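
Limiting SQL Server to 7 of 8 processors is done through the `affinity mask` server option, where each bit stands for one CPU. A small sketch of the arithmetic, assuming CPU 0 is the one left to the OS as suggested above; `affinity_mask` is a hypothetical helper name.

```python
from typing import Iterable


def affinity_mask(cpus: Iterable[int]) -> int:
    """Build a bitmask with one bit set per CPU that SQL Server may use."""
    mask = 0
    for cpu in cpus:
        # The 'affinity mask' option covers CPUs 0-31 only.
        if not 0 <= cpu < 32:
            raise ValueError(f"CPU index out of range for affinity mask: {cpu}")
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # 8-way box: leave CPU 0 to the OS, give CPUs 1-7 to SQL Server.
    mask = affinity_mask(range(1, 8))
    print(mask, hex(mask))  # 254 0xfe
    print(f"EXEC sp_configure 'affinity mask', {mask};")
```

The resulting value, 254, is what would be passed to `sp_configure 'affinity mask'`. It is worth testing on the passive node first, since a wrong mask can starve the instance instead of the OS.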

Jennifer McConnell SSC Rookie Points: 32 · Answer 6

March 10, 2006 at 9:38 am

#626023

We are experiencing a very similar problem. Have you found a resolution yet?

imarchenko-236376 SSCommitted Points: 1734 · Answer 7

March 10, 2006 at 10:57 am

#626041

Jennifer,

In our case, there was a problem with the network card, which was fixed by our network engineer.

Igor

Todd Nordquist Newbie Points: 7 · Answer 8

March 10, 2006 at 1:00 pm

#626082

I'm the network engineer at Jennifer's company. Please enlighten me as to the exact cause; it would be much appreciated.

imarchenko-236376 SSCommitted Points: 1734 · Answer 9

March 10, 2006 at 2:43 pm

#626098

Sorry, our network guy is on vacation. If I recall correctly, it had to do with an HP hardware-specific network card reconfiguration.

Igor

cpeck523 Old Hand Points: 360 · Answer 10

May 9, 2006 at 9:12 am

#636782

We saw a similar situation. The anti-virus software was causing a disk to fail its consistency checks, which caused a failover to the next node in the cluster. Upgrading the anti-virus software corrected this.

TDuffy SSCarpal Tunnel Points: 4170 · Answer 11

We also had a restart problem, and none of the prior ideas worked. It seems that Terminal Server was causing a reduction in the working set size of all Windows processes (including SQL Server). That caused the connection between Cluster Admin and SQL Server to fail and prevented Cluster Admin from reconnecting, so Cluster Admin sent a restart to SQL Server, as it is configured to do.

See this article for details and the fix: http://support.microsoft.com/kb/905865/en-us

Terry

DBA-MANAMADURAI SSC Veteran Points: 222 · Answer 12

October 4, 2006 at 9:48 am

#664125

Did your problem get fixed after applying the hotfix? We are also having the same issue. Thanks much.

DBA

TDuffy SSCarpal Tunnel Points: 4170 · Answer 13

October 4, 2006 at 9:58 am

#664127

Yes, the problem was resolved with the hotfix.

Terry

Balmukund SSCertifiable Points: 6648 · Answer 14

My 2 cents....

1. Is it 64-bit SQL Server?

2. Is it an HP box?

3. Is it an x64 CPU?

If all of the above are true, then do the following two things.

1. Follow KB:

918483 You can enable the lock pages in memory permissions to prevent SQL Server 2005 64-bit buffer pool memory from being paged out of physical memory

http://support.microsoft.com/default.aspx?scid=kb;EN-US;918483

2. Check the version of

C:\WINDOWS\SYSTEM32\DRIVERS\CPQCIDRV.SYS

If it is 1.7:3790.0, contact HP to get a fix for this driver:

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c00688313&jumpid=reg_R1002_USEN