SQL Server 2005 self restart

  • Hello!
     
       We have configured SQL Server 2005 64-bit (8 CPUs, 28 GB RAM) on a Windows 2003 active/passive cluster. Besides the fact that the 32-bit SQL Server Management Studio is very slow on the 64-bit machine (we connect to the box over RDP), we are also seeing SQL Server restart itself when running DBCC DBREINDEX (or ALTER INDEX ALL ON in the new syntax). SQL Server generates a dump and restarts itself. The dump reports issues similar to:
     
    Process 54:0:5 (0x1494) Worker 0x00000000820001C0 appears to be non-yielding on Scheduler 3. Thread creation time: 12784883544792. Approx Thread CPU Used: kernel 78 ms, user 0 ms. Process Utilization 0%. System Idle 89%. Interval: 70000 ms.
     
    Process 54:0:7 (0xa68) Worker 0x0000000081BDE1C0 appears to be non-yielding on Scheduler 4. Thread creation time: 12784883544808. Approx Thread CPU Used: kernel 46 ms, user 0 ms. Process Utilization 0%. System Idle 88%. Interval: 110204 ms.
    .
    .
    .
    Process 54:0:7 (0xa68) Worker 0x0000000081BDE1C0 appears to be non-yielding on Scheduler 4. Thread creation time: 12784883544808. Approx Thread CPU Used: kernel 62 ms, user 0 ms. Process Utilization 0%. System Idle 87%. Interval: 104782 ms.
    The connection has been lost with Microsoft Distributed Transaction Coordinator (MS DTC). Recovery of any in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC) will begin once the connection is re-established. This is an informational message only. No user action is required.
    Service Broker manager has shut down.
     
    and finally:
     
    SQL Server is terminating in response to a 'stop' request from Service Control Manager. This is an informational message only. No user action is required.
     
    I was wondering if anyone has experienced problems similar to those described above.
     
    Thanks,
    Igor

     

  • It looks like the problem is with MSDTC and the cluster, not the command you're sending.

    Try removing MSDTC (msdtc -uninstall) from both nodes - you'll need to break the cluster to do this, then reinstall msdtc.

  • Igor,

    I have seen this type of error on a 64-bit SQL Server 2000 box, where it was resolved in SP4, but not on 2005. One thing I have noticed on SQL Server is that reindexing can consume a lot of resources. What I would try is:

    1. Lower your max degree of parallelism to between 2 and 4 so that the reindex only uses 2-4 processors. I have an 8-way box and 4 is my setting.
    2. Leave 1.5 or 2 GB of memory or more for the OS, depending on what else you have running on the box (RS, SQL Agent, etc.). Try bumping SQL Server's memory down to 20 GB and see if that solves the problem, then ease it back up.
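
    As a rough illustration, the two settings above could be applied with sp_configure along the following lines. This is only a sketch: the values 4 and 20480 MB mirror the numbers suggested in this post, and the table name dbo.MyTable is a hypothetical placeholder, not something from the original system.

```sql
-- Both options are advanced, so advanced options must be visible first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Suggestion 1: cap parallelism server-wide (4 is the value used on the 8-way box above).
EXEC sp_configure 'max degree of parallelism', 4;

-- Suggestion 2: cap SQL Server memory at about 20 GB (20 * 1024 MB) to leave room for the OS.
EXEC sp_configure 'max server memory (MB)', 20480;
RECONFIGURE;

-- Alternative for a single rebuild: limit parallelism per statement instead of server-wide.
-- dbo.MyTable is a hypothetical table name.
ALTER INDEX ALL ON dbo.MyTable REBUILD WITH (MAXDOP = 4);
```

    Running EXEC sp_configure with only the option name shows the configured and running values, which is a quick way to confirm the change took effect.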

    Hope this helps (let us know if it does!)

  • Bruce,

       The original problem with DBCC DBREINDEX was caused by a problem with the active node. Once we failed over to the other node, I managed to finish rebuilding the remaining tables.

        Of course, there is still something wrong with the cluster. Just yesterday, SQL Server restarted itself. I am currently troubleshooting this issue and trying to find the cause.

    I looked into the SQL Server error log and the event log and noticed errors/warnings logged prior to the restart related to:

    1. Virtual Memory being too low

    2. The connection has been lost with Microsoft Distributed Transaction Coordinator (MS DTC)

    3. SSPI handshake failed with error code 0x8009030c while establishing a connection with integrated security; the connection has been closed.

       I was wondering what virtual memory size you would recommend, given that we have 28 GB of RAM. Item 3 seems to be related to the SQL Server cluster not being able to verify the user's identity/certificate. We did reserve 1.5 GB for the OS.

       We followed your recommendation and changed max degree of parallelism. In our case it has been set to 1.

     

    Thanks,

    Igor

  • How are your nodes connected, and how far apart are they?

  • I have had this problem with a cluster before. The machine became starved for resources, the heartbeat was not detected, and the cluster initiated a failover. You may also want to reserve one processor for the OS and limit SQL Server to 7 processors. In SQL Server 2000, processor 0 was designated for I/O, network, and other system functions.

    I do not know whether that is still the case in SQL 2005. In the end, you are trying to leave enough resources for the system and the cluster no matter what SQL Server does. Identity verification problems are another symptom of resource constraints. With regard to the swap file, 3 GB sounds good, but make sure it is not on the same drives as your data/log files.
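
    For illustration, limiting SQL Server to 7 of the 8 processors could be sketched with the affinity mask option as below. This assumes 8 logical CPUs numbered 0-7 and that CPU 0 is the one to leave free for the OS, which this post does not confirm for SQL 2005.

```sql
-- 'affinity mask' is an advanced option.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Bitmask 254 = binary 11111110: bits 1-7 set, bit 0 clear,
-- so SQL Server schedulers are bound to CPUs 1-7 and CPU 0 is left to the OS.
-- Assumption: an 8-CPU box where CPU 0 is the one to reserve.
EXEC sp_configure 'affinity mask', 254;
RECONFIGURE;
```

    On a box with a different CPU count the mask value would need to be recomputed, one bit per processor.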

     

    Thanks

  • We are experiencing a very similar problem.  Have you found a resolution yet?

  • Jennifer,

    In our case there was a problem with the network card, which was fixed by our network engineer.

     

    Igor

     

  • I'm the network engineer at Jennifer's company.  Please enlighten me as to the exact cause, it would be much appreciated.

  • Sorry, our network guy is on vacation. If I recall correctly, it had to do with an HP hardware-specific network card reconfiguration.

    Igor

  • We saw a similar situation. The anti-virus software was causing a disk to fail consistency checks, which caused a failover to the next node in the cluster. Upgrading the anti-virus software corrected this.

  • We also had a restart problem, and none of the prior ideas worked. It seems that Terminal Server was causing a reduction in the working set size of all Windows processes (including SQL Server). That caused the connection between Cluster Admin and SQL Server to fail and prevented Cluster Admin from reconnecting, so Cluster Admin sent a restart to SQL Server, as it is configured to do.

    See this article for details and the fix: http://support.microsoft.com/kb/905865/en-us

    Terry

     

     

  • Did your problem get fixed after applying the hotfix? We are also having the same issue. Thanks much.

     


    DBA

  • Yes, the problem was resolved with the hotfix.

    Terry

  • My 2 cents....

     

    1. Is it 64-bit SQL Server?

    2. Is it an HP box?

    3. Is it an x64 CPU?

    If all of the above are true, then do the following two things.

    1. Follow KB:

    918483 You can enable the lock pages in memory permissions to prevent SQL Server 2005 64-bit buffer pool memory from being paged out of physical memory

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;918483

    2. Check the version of

    C:\WINDOWS\SYSTEM32\DRIVERS\CPQCIDRV.SYS

    If it is 1.7:3790.0,

    then contact HP to get a fix for this driver:

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c00688313&jumpid=reg_R1002_USEN 

     
