posted Tuesday, September 2, 2008
As a SQL Server administrator you are probably familiar with this scenario: The phone rings and it is one of your users saying “the system seems really slow today”, or if you have monitoring and alerting set up you get an email telling you the CPU(s) on your SQL server is maxed at 100%. Regardless of the method of alert, you have to figure out what is causing the problem. You investigate and find that the sqlservr.exe process is using most, if not all of the available CPU cycles. Beyond that however, you have no idea of exactly what in SQL Server is causing the CPU to be pegged at 100%.
When faced with the issue of identifying the exact runaway process or statement that is bringing your SQL Server to its knees you will find that there is no easy solution. There is no single place in the provided tools to dynamically list the percent of CPU taken by each SQL Server thread process.
Even identifying the specific SQL Server process leaves you grasping for more information. You really want to know the exact code or statement that is soaking up precious CPU cycles. Using the techniques described in this paper you will be able to identify the culprit and react appropriately, adding efficiency and value to your organization.
The purpose of this paper is to describe the methods that can be used to determine what SQL Server process is consuming CPU cycles. Through the use of Performance Monitor, Query Analyzer, Enterprise Manager and Profiler, tools provided with Microsoft Windows and Microsoft SQL Server, the administrator can pinpoint the specific query or operation that is causing high CPU.
Step-by-step instructions, including screen captures, detailing the setup and execution are provided to enable the system administrator to quickly and accurately determine the exact cause and start him on the correct course to fix the problem. A special section addressing cursors and stored procedures is included to deal with situations where the statement causing the problem is not readily apparent as is common in applications such as JD Edwards EnterpriseOne.
A bonus section is included with additional tips to make the methods described in the paper quick and easy to repeat. Several additional tips related to Oracle JD Edwards EnterpriseOne are included.
Total Pages - 22
When a high CPU condition caused by SQL Server presents itself on your server, there is no quick, simple way to determine exactly what SQL Server operation is causing the problem.
The steps to get to the answer are not readily apparent, not integrated and do not lend themselves to an easy approach to a solution. We simply need a way to pinpoint the exact operation within SQL Server that is causing the high CPU condition.
To achieve our goal of determining the statement or operation causing high SQL Server CPU we will make use of the multiple SQL Server tools necessary to get to the detail level we desire.
We will follow a troubleshooting trail from Performance Monitor to Query Analyzer to Enterprise Manager to Profiler to get to the bottom of the problem. When completed, you will be able to quickly determine which operation is causing the high CPU problem.
Each step in the solution has detailed setup and execution instructions, complete with screen shots and descriptions.
In order to determine exactly what statement or operation is causing high CPU on the SQL server, it is necessary to follow a path that looks like this:
To get from the application that can tell you there is a high CPU issue (Performance Monitor) to where you can view the command causing the issue (Enterprise Manager or Profiler) you must follow the trail mentioned above.
To view the statement or operation causing the problem, it is necessary to know the SPID (Server Process ID). To get the SPID you must know the KPID (Kernel Process ID). To get the KPID you must have the ID Thread. To get the ID Thread, you must know the Thread or Instance number. So, you follow the trail above to achieve the goal of determining the exact command causing your high CPU state.
Note: See comments at the end of the article if you would like to use Sysinternals' Process Explorer to determine the KPID instead of perfmon. It is indeed a little easier.
In Performance Monitor, you will view the SQL threads' CPU consumption. Open perfmon and add the appropriate counters by selecting the “thread” performance object, the counter object “% Processor Time”, and all of the “sqlservr” instances.
Figure 1 Adding thread instances
This will display %Processor Time by thread instance number and allow you to identify the high CPU thread.
The Performance Monitor window will show each instance as a line in graph mode or as bars in histogram mode. The instance counters will be listed at the bottom as well.
Figure 2 Perfmon showing thread instances
Start a second session of perfmon, and click the View/Report button. Then add the performance object “thread”, the counter object “ID Thread”, and all the “sqlservr” instances.
Figure 3 Adding ID threads
This will appear as a tabular report in Performance Monitor with the ID thread number listed below the corresponding thread instance number.
Figure 4 Perfmon showing thread-ID to thread correlation
When the occasion arises where you must find out what SQL Server SPID is causing the high CPU, execute the steps below.
In Performance Monitor session #1 find the thread instance that shows sustained high CPU %. Note that on a busy system there may be several thread instances showing high CPU %. If you are troubleshooting a long-running process, the obvious thread will show after a prolonged period of watching. You may need to switch to histogram view in Performance Monitor if there are a large number of active SQL Server threads. Note that it may not be necessary to choose one thread as the process can be spread among multiple threads that all correlate back to a single SPID. For example you may find that instance 28 and instance 50, while having different ID Threads, may have the same SPID and the SPID is what we are trying to find.
Once you choose a thread instance, double-click on the graph line (or bar in histogram view), highlighting the instance in the bottom panel and enabling you to identify the insance number. Note the instance number, in this case instance 28.
Figure 5 Perfmon showing high CPU thread selected
Switch to performance monitor #2 and find the instance number noted above in the Thread row. Note the ID Thread listed below the thread. In this case the thread is 28 and the ID Thread is 2936.
Figure 6 Perfmon showing thread-ID to thread correlation
The ID Thread correlates with KPID (Kernel Process ID). Once we have the KPID we are one step closer to our goal of getting the SPID of the high CPU SQL process. The KPID to SPID relationship can be determined by executing the following SQL statement on the affected server while the offending process is running:
select spid, kpid, status, hostname, dbid, cmd from master..sysprocesses
where kpid != 0 order by kpid
The results show which SPID (Server Process ID) correlates with the KPID. In our case KPID 2936 is associated with SPID 111.
Figure 7 Query Analyzer results correlating KPID-SPID
With this information you can now use Enterprise Manger, sp_who2 or other tools to find out more about what the SPID is doing.
In Enterprise Manager select the appropriate SQL Server then expand Management/Current Activity/Process Info. In the list of Process IDs find the SPID you determined in the previous step. You can double-click the SPID record to see exactly what the last batch command was. In the case of SPID 111, we see that a select statement was being run against CRPDTA.F0911, a table with over 8 million records, causing significant CPU usage by SQL Server.
Figure 8 Enterprise Manager showing process' last command
An issue arises however, if the activity you are trying to identify is typical of EnterpriseOne, which makes extensive use of cursors. When correlating back to a SPID, the activity you will likely see is a stored procedure cursor command such as sp_cursorprepexec or sp_cursorfetch instead of an actual SQL statement. All this indicates is that a cursor is being executed and tells you little about the underlying activity that is causing your CPU increase.
Figure 9 Enterprise Manager showing cursor operation
You can still get to the root of your problem if the SPID correlates back to a cursor; it will just take a little more work.
To find what activity is associated with the cursor operation you will use SQL Server Profiler, a performance analysis tool that comes with SQL Server and can be used to trace events and activities on your SQL server.
Since Profiler tracing adds overhead, we must very narrowly define our criteria for analysis in Profiler in order to not overburden the server being monitored. Since we are still operating from the assumption that we are searching for the root cause of an activity causing high CPU on SQL Server, it would not be helpful to add to the load unnecessarily.
Open Profiler and select New/Trace from the File menu. Select the affected server from the dropdown list.
On the “General” tab enter the information about your trace. Name your trace, select Template Name “Blank”, select a location to save the trace and set the maximum rows. Since we have the option to save the trace information to a table we shall do so.
Figure 10 Configuring Profiler trace
Select the “Events” tab and add the following events:
Cursors: CursorClose, CursorExecute, CursorOpen,
Performance: Show Plan Text
Stored Procedures: RPC:Completed, RPC:Starting, SP: StmtCompleted, SP: StmtStarting
Figure 11 Configuring Profiler events
These are the events that we will trace in Profiler. Select the “Data Columns” tab and add the following to the “Columns” section:
Start Time, BinaryData, TextData. These values will join EventClass and SPID which are already present.
Figure 12 Configuring Profiler Columns
These are the data items that will be visible in the trace results.
Select the “Filters” tab to set up Profiler to only show the events from the SPID that we discovered in the earlier sections to be the offending process. In the example picture we will presume that the offending SPID was found to be 111. Enter 111 in the SPID/Equals filter to see only events associated with SPID 111.
Figure 13 Configuring Profiler filter
Click the Run button and you should see data scrolling by. If the operation you are trying to trace has already gone into the use of cursors you may have difficulties telling exactly what is causing the high CPU. If you are fortunate, a quick examination of the trace results will point to the exact cause.
The trace below shows several select statements interspersed with the cursor executions.
Figure 14 Profile trace results
You should be able to analyze the trace results to determine your high CPU culprit, even if it involves a cursor action. Some detective work may be required but it is still much simpler than guessing what SQL Server process is causing your problem.
Using the methods provided in the solution detailed, an administrator will be able to pinpoint the cause of high CPU in SQL Server.
This solution will be quick, efficient, repeatable and will work regardless of whether the operation causing the increased CPU usage is a simple SQL statement or a stored procedure.
The SQL Server administrator now possesses a reliable method to help answer the question: “Why does the system seem slow today”
To allow fast reaction time to a high CPU condition you can save the Performance Monitor console settings once you have set them as described in the document. By saving each perfmon session you can quickly open the .msc file and begin observing the SQL Server activity.
By also saving the SQL statement that correlates the KPID to SPID you can also quickly execute it.
You can save the Profiler template to a file that preserves the settings for quick access. Opening the Profiler template and changing the SPID ID filter to the appropriate SPID is all you have to do to start capturing transaction information.
The KPID can be associated back to a EnterpriseOne batch job or OneWorld kernel process. Simply take the KPID found in the ID Thread row of perfmon #2 and check task manager on the enterprise server the UBE or BSFN is running on. Sort the list by PID and look for the ID Thread (KPID) number. You can also find the server log for this process in C:\JDEdwardsOneWorld\ddp\B7334\Log\jde_nnnn.log where nnnn is the KPID number.
In the above, if the KPID is a OneWorld batch job, you can go to Work With Submitted Jobs and enter the KPID in the Process ID field to determine which UBE is causing the issue.
If the above is a JAS-related process one can try to find the KPID in the Host-PID field of the User List of SAW WEB