SQL Server Crashed

  • Hi All,

    I am trying to find the root cause of a SQL Server crash.

    A timeout (30000 milliseconds) was reached while waiting for a transaction response from the MSSQLSERVER service.

    The SQL Server (MSSQLSERVER) service terminated unexpectedly. It has done this 5 time(s).

    AutoRestart: Unable to restart the MSSQLSERVER service (reason: An instance of the service is already running)

    This file is generated by Microsoft SQL Server

    version 12.0.4213.0

    upon detection of fatal unexpected error. Please return this file,

    the query or program that produced the bugcheck, the database and

    the error log, and any other pertinent information with a Service Request.

    Computer type is Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz.

    Bios Version is Xen - 0

    Revision: 1.221

    32 X64 level 8664, 10 Mhz processor (s).

    Windows NT 6.2 Build 9200 CSD .

    Memory

    MemoryLoad = 94%

    Total Physical = 249999 MB

    Available Physical = 14127 MB

    Total Page File = 267338 MB

    Available Page File = 28793 MB

    Total Virtual = 134217727 MB

    Available Virtual = 133598094 MB

    ***Stack Dump being sent to D:\tempdataroot\MSSQL12.MSSQLSERVER\MSSQL\LOG\SQLDump0039.txt

    SqlDumpExceptionHandler: Process 159 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is

    terminating this process.

    * *******************************************************************************

    *

    * BEGIN STACK DUMP:

    * 07/29/16 16:21:52 spid 159

    *

    *

    * Exception Address = 000000007A93E615 Module(cwbodbc+000000000005E615)

    * Exception Code = c0000005 EXCEPTION_ACCESS_VIOLATION

    * Access Violation occurred reading address 000000006029E2CA

    * Input Buffer 48 bytes -

    * _GetTransfers

    *

    This dump is generated frequently; the stored procedure name in the input buffer changes each time.

    We have a linked server that pulls data from an AS400 using the cwbodbc (IBM iSeries) driver.

    This is the primary node of an AG setup. After the primary node failed, automatic failover kicked in but did not succeed.

    Following errors were reported.

    The lease between availability group 'P1AG' and the Windows Server Failover Cluster has expired. A connectivity issue occurred between the instance of SQL Server and the Windows Server Failover Cluster. To determine whether the availability group is failing over correctly, check the corresponding availability group resource in the Windows Server Failover Cluster

    AlwaysOn Availability Groups connection with secondary database terminated for primary database 'a' on the availability replica 'P2' with Replica ID: {}. This is an informational message only. No user action is required.

    The availability group database "a" is changing roles from "PRIMARY" to "RESOLVING" because the mirroring session or availability group failed over due to role synchronization. This is an informational message only. No user action is required.

    Unable to access availability database 'a' because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later.

    Please help me troubleshoot this.

  • Quick questions: what else is running on the server? What are the SQL Server memory settings? Any relevant entries in the Windows Event Log? What is the storage configuration? Any relevant log entries from the IO subsystem? Any errors in the BIOS log?

    😎

    SQL Server is entirely dependent on the host OS for IO, including disk, memory, and network, so there are likely indications in the logs of those subsystems.

  • SQL Server Memory : 220 GB

    Total memory: 244 GB

    Apart from SQL Server, there is a C# service that runs a number of stored procedures and pulls data from the AS400.

    There is not much in the Windows Event Viewer, apart from these entries:

    The lease between availability group '' and the Windows Server Failover Cluster has expired. A connectivity issue occurred between the instance of SQL Server and the Windows Server Failover Cluster. To determine whether the availability group is failing over correctly, check the corresponding availability group resource in the Windows Server Failover Cluster.

    Cluster resource '' of type 'SQL Server Availability Group' in clustered role '' failed.

    Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

    No matching network interface found for resource '' IP address '' (return code was '5035'). If your cluster nodes span different subnets, this may be normal.
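
    As a rough sanity check on the memory figures quoted in this thread, the arithmetic can be sketched as below. The variable and function names are illustrative, and the sketch assumes the 220 GB figure is the memory cap configured for SQL Server; the thread does not say which setting it refers to.

```python
# Figures quoted in this thread; all names here are illustrative only.
TOTAL_PHYSICAL_MB = 249_999      # "Total Physical" from the dump header
AVAILABLE_PHYSICAL_MB = 14_127   # "Available Physical" from the dump header
SQL_MEMORY_GB = 220              # "SQL Server Memory" as posted (assumed to be the cap)
TOTAL_MEMORY_GB = 244            # "Total memory" as posted


def memory_load_percent(total_mb: int, available_mb: int) -> float:
    """Return the share of physical memory in use, as a percentage."""
    return 100.0 * (total_mb - available_mb) / total_mb


def headroom_gb(total_gb: int, sql_gb: int) -> int:
    """Return the memory left for the OS and every other process."""
    return total_gb - sql_gb


load = memory_load_percent(TOTAL_PHYSICAL_MB, AVAILABLE_PHYSICAL_MB)
spare = headroom_gb(TOTAL_MEMORY_GB, SQL_MEMORY_GB)
print(f"memory load: {load:.1f}%")
print(f"headroom outside SQL Server: {spare} GB")
```

    The computed load agrees with the MemoryLoad = 94% line in the dump header, and the remaining headroom has to cover the OS, the C# service, and anything else on the box.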

  • With Access Violations causing SQL to terminate, you may be best off opening a case with Microsoft's Customer Support if it's a repeated crash. CSS have tools to read through stack dumps and identify the root cause.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Sure, thanks. I will do that.

  • I also noticed the OLEDB wait type just before the crash.

    wait_info

    (596ms)OLEDB

    (769ms)OLEDB

    (926ms)OLEDB

    (183ms)OLEDB

    (848ms)OLEDB

    (1977ms)OLEDB

    (83ms)OLEDB

    (1324ms)OLEDB

    (274ms)OLEDB

    (623ms)OLEDB

    I am guessing it's the cwbodbc driver that caused the crash.
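
    The dump header quoted in the first post already names a module on its "Exception Address" line. Since the dump is generated frequently, one way to test the guess is to tally which module shows up across the dump text files. The following is a minimal sketch, assuming the header line keeps the format shown in this thread; the function name and regular expression are illustrative and not part of any SQL Server tooling.

```python
import re
from collections import Counter

# Matches the "Exception Address" line of the dump header, e.g.
# "* Exception Address = 000000007A93E615 Module(cwbodbc+000000000005E615)"
MODULE_RE = re.compile(
    r"Exception Address\s*=\s*[0-9A-Fa-f]+\s+"
    r"Module\((?P<module>[^+)]+)\+(?P<offset>[0-9A-Fa-f]+)\)"
)


def faulting_modules(dump_text: str) -> Counter:
    """Count how often each module appears on an Exception Address line."""
    return Counter(m.group("module") for m in MODULE_RE.finditer(dump_text))


# Header lines copied from the dump quoted in this thread.
sample = """\
* BEGIN STACK DUMP:
* 07/29/16 16:21:52 spid 159
* Exception Address = 000000007A93E615 Module(cwbodbc+000000000005E615)
* Exception Code = c0000005 EXCEPTION_ACCESS_VIOLATION
"""

counts = faulting_modules(sample)
print(counts.most_common())
```

    Feeding it the concatenated text of several SQLDump files would show whether the same module is named every time, which is stronger evidence than the OLEDB waits alone.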

  • Could well be, but many DMVs internally use OLEDB, so not definitive.

    See if there's an updated driver and, if there is, try that?

    Gail Shaw
  • This is the output from whoisActive for all the sessions with an OLEDB wait:

    sql_text: INSERT INTO @P

    EXEC (@SQL) AT linkedservername

    I will try getting the driver updated.

  • I had an issue in the past where a SQL Server 2012 AlwaysOn availability group would fail and cause SQL Server to crash with an "access violation" error message.

    This happened when adding a new database to the availability group, so I am not sure it is the same as your issue. The cause was that someone had created databases from scripts and had somehow ended up with two databases sharing the same Service Broker GUID. Once we dropped the database and recreated it, a new GUID was issued and the problem went away.
