SQL2008R2 and blocking using sp_setapprole and sp_unsetapprole

  • We recently migrated from 32-bit SQL2005 SP2 (single server) to 64-bit SQL0208R2 RTM (5-node cluster) for one of our main applications. Since the upgrade, we've been seeing some very odd behavior. Setting and unsetting the Application Role at the beginning and end of transactions are causing blocking on the System Resource Database's (database ID 32767) schema (Sch-M). In sp_who2, the lead blocker is always suspended with a command of 'UNKNOWN TOKEN' or 'SETUSER'. After 1-10 seconds, the block resolves and the application just seems unbearably sluggish to the users, but during times of peak activity the blocked processes pile upon each other and every user times out. The application is a VB.NET (.NET 2.0) forms application. Yuck, I know. It uses connection pooling and application roles. It was running fine in 2005 and our rather exhaustive testing did not uncover this issue. We're unable to duplicate it in test, either , with 15 test users banging away at the app as fast as they could.

    More info:

    Sample blocked process report:

    <blocked-process-report>

    <blocked-process>

    <process id="processa9d708" taskpriority="0" logused="0" waitresource="METADATA: database_id = 6 DATABASE_PRINCIPAL(principal_id = 116)" waittime="1223" ownerId="209005460" transactionname="UnSetAppRole" lasttranstarted="2011-11-16T14:07:09.387" XDES="0x8001f9c0" lockMode="Sch-M" schedulerid="6" kpid="10960" status="suspended" spid="105" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2011-11-16T14:07:09.387" lastbatchcompleted="2011-11-16T14:07:09.387" clientapp="Infinity" hostname="29YB5J1" hostpid="1080" loginname="OURDOMAIN\USER1" isolationlevel="read committed (2)" xactid="209005459" currentdb="6" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">

    <executionStack>

    <frame line="22" stmtstart="950" stmtend="1114" sqlhandle="0x0300ff7f58d3d430aeb418014c9d00000100000000000000"/>

    </executionStack>

    <inputbuf>

    Proc [Database Id = 32767 Object Id = 819254104] </inputbuf>

    </process>

    </blocked-process>

    <blocking-process>

    <process status="running" spid="100" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2011-11-16T14:07:09.387" lastbatchcompleted="2011-11-16T14:07:09.387" clientapp="Infinity" hostname="69RDWH1" hostpid="5836" loginname="OURDOMAIN\USER2" isolationlevel="read committed (2)" xactid="209005456" currentdb="6" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">

    <executionStack>

    <frame line="22" stmtstart="950" stmtend="1114" sqlhandle="0x0300ff7f58d3d430aeb418014c9d00000100000000000000"/>

    </executionStack>

    <inputbuf>

    Proc [Database Id = 32767 Object Id = 819254104] </inputbuf>

    </process>

    </blocking-process>

    </blocked-process-report>

    Looking at system waits, LCK_M_SCH_S & LCK_M_SCH_M are #2 and #3 respectively.

    We've seen no indication of hardware bottlenecks at any level (CPU, IO, Memory, network, etc...)

    We have a Sev A incident open with Microsoft, but after 24+ hours they've been of little help. They suggested that we implement trace flag 4616, which we'll be doing after hours tonight and keeping our fingers crossed.

    Has anyone else seen similar behavior?

    Thanks!

    -Ken

  • Wanted to follow up on this in case anyone else runs into it.

    We were re-using connections using connection pooling, but setting and un-setting the application role before each transaction. This was causing the TokenAndPermUserStore in the cache to grow out of control. The system was paging through this cache each time were were calling sp_setapprole or sp_unsetapprole, but this region of cache had grown to over 5GB. MS told us that performance typically starts to degrade when it grows over about 300MB or so.

    As a short-term fix we flushed the cache using:

    DBCC FREESYSTEMCACHE (‘TokenAndPermUserStore’)

    and kept an eye on the size for a while using

    SELECT SUM(single_pages_kb + multi_pages_kb) AS "SecurityTokenCacheSize(kb)"

    FROM sys.dm_os_memory_clerks

    WHERE name = 'TokenAndPermUserStore'

    We saw that during peak periods, the cache would grown over 500MB 5 minutes or so, so we set up an agent job to run flush the cache every 5 minutes.

    As a long-term fix, we modified our application to remove the use of Application Roles. A major pain, but the problem is now gone. Given how we were using application roles, we were not seeing any benefit from them anyhow.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply