Recently we encountered a case where in a sql server which was running and hosting Biztalk instances was showing 100% CPU.
We started troubleshooting from the SQL end to identify who is major consumer of CPU and following were our observations and findings
· We used the perfmon counter Process %Processor time and confirmed that sqlservr.exe was the major consumer of the CPU.
· Since the recommendation for Biztalk server is to have MAXDOP 1 we ruled out parallelism as a cause of HIGH CPU.
· We checked the perfmon counter Sql Server: SQL Statistics , Batch Requests/sec , Sql Compilations/sec and SQL Recompilations/sec but we could not see any major compilations or recompilations/sec so even that was ruled as a cause of the High CPU
· Finally we started looking at the queries which are major consumers of CPU. We used the following queries and queried the DMvs sys.dm_exec_requests and sys.dm_exec_query_stats to find the queries consuming high CPU
SELECT st.text, r.cpu_time FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st
ORDER BY cpu_time DESC
SELECT TOP 5 total_worker_time/execution_count AS [Avg CPU Time],
WHEN -1 THEN DATALENGTH(st.text)
END – qs.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY total_worker_time/execution_count DESC;
· We found that following two stored procedures were one of the major consumers of CPU resources while all the other queries on the servers were getting blocked and waiting on SOS_SCHEDULER_YIELD
· We consulted our Biztalk team and found that the first stored proc is used for performance data collection which is fired every 5 secs while second stored procedure is used to dequeue messages coming from Biztalk
· We further drilled the stored procedure [DBO].[BTS_DEQUEUEMESSAGES_BIZTALKSERVERISOLATEDHOST] and found it was using a cursor to query table
[BizTalkServerIsolatedHostQ] which was causing blocking on the server.
DECLARE btscurse CURSOR FAST_FORWARD FOR
SELECT TOP 20 w.uidWorkID, w.uidInstanceID, w.uidMessageID, w.snPartRetrieval,
w.fOrderedDelivery, w.snPriority, w.dtLastTouched, w.nID, i.uidProcessID,
i.uidActivationID, i.nState FROM [BizTalkServerIsolatedHostQ] AS w
WITH (UPDLOCK ROWLOCK READPAST) INNER LOOP JOIN Instances AS i
WITH (UPDLOCK ROWLOCK READPAST) ON ( w.uidInstanceID = i.uidInstanceID AND uidAppOwnerID = N‘85459549-B70C-4525-B1E6-72D3C0125E52’) AND ( i.uidProcessID IS NULL OR i.uidProcessID = @uidProcessID )
· We queried the table BizTalkServerIsolatedHostQ and found that there were 581683 messages which were undelivered which were unusual. SO it indicates some of the Orphaned records might be cached in the table which are zombie
In order to resolve the issue we used the following Action Plan
· In order to minimize the SQL CPU usage due to the stored procedure MSGBOXPERFCOUNTERS_GETSPOOLSIZE , we set the following registry key to 30sec
HKLMSoftwareMicrosoftBizTalk Server3.0MsgBox Performance CacheRefreshInterval to 30 seconds
IMPORTANT : This resolution contains information about modifying the registry. Before you modify the registry, make sure to back it up and make sure that you understand how to restore the registry if a problem occurs. For information about how to back up, restore, and edit the registry, click the following article number to view the article in the Microsoft Knowledge Base:
256986 Description of the Microsoft Windows Registry
After the above changes the CPU on the sql server was resumed to normal.
Hope this help!! 🙂
Premier Field Engineer, Microsoft