Windows Server Troubleshooting - Processor



2003-2006 Team Approach Limited
All rights reserved


Recognizing a Processor Bottleneck

Processor bottlenecks occur when the processor is so busy that it cannot respond to requests quickly. Excessive processor activity can be identified by

  • a high rate of processor activity
  • a long, sustained processor queue, which is the more certain indicator.

As you monitor processor and related counters, you can recognize a developing bottleneck by the following conditions:

  • Processor\% Processor Time often exceeds 80 percent.
  • System\Processor Queue Length is often greater than 2 on a single-processor system.
  • Unusually high values appear for the Processor(_Total)\Interrupts/sec or System\Context Switches/sec counters.
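The first two conditions above can be expressed as a simple check over sampled counter values. The following is a minimal Python sketch rather than a definitive tool: the function names, the sample lists, and the reading of "often" as "in at least half of the samples" are assumptions made for illustration.

```python
def fraction_above(samples, threshold):
    """Return the fraction of samples strictly greater than threshold."""
    if not samples:
        return 0.0
    return sum(1 for value in samples if value > threshold) / len(samples)


def bottleneck_indicators(cpu_percent, queue_length,
                          cpu_limit=80.0, queue_limit=2, often=0.5):
    """Report which bottleneck conditions hold on a single-processor system.

    cpu_percent and queue_length are lists of samples of the
    % Processor Time and Processor Queue Length counters.
    often is an assumed cutoff: the share of samples that must exceed
    the limit before the condition counts as occurring "often".
    """
    return {
        "busy": fraction_above(cpu_percent, cpu_limit) >= often,
        "queued": fraction_above(queue_length, queue_limit) >= often,
    }
```

For example, bottleneck_indicators([85, 90, 70, 95], [3, 4, 1, 5]) reports both conditions as true, because three of the four samples exceed each limit.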

Hardware or Software

Our troubleshooting methodology is to continue to divide and conquer. Once high values of the % Processor Time counter have shown that processor utilization is high, we can continue isolating the problem by asking the question

  • Is the high utilization caused by hardware or software?

The cause of a processor bottleneck can be determined by examining two additional counters.

  • High System calls/sec indicates high utilization caused by software
  • High Interrupts/sec indicates high utilization caused by hardware devices
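The hardware-or-software question can be sketched as a comparison of the two counters against a baseline. This Python sketch is illustrative only: the function name, the baseline values, and the rule that "high" means more than twice the baseline are all assumptions, since the appropriate thresholds depend on the normal behavior of each server.

```python
def classify_load(system_calls_per_sec, interrupts_per_sec,
                  baseline_system_calls, baseline_interrupts,
                  factor=2.0):
    """Suggest whether high utilization points to software or hardware.

    A counter is treated as "high" when it exceeds its baseline by the
    given factor; the baselines and the factor are assumed inputs.
    """
    high_calls = system_calls_per_sec > factor * baseline_system_calls
    high_interrupts = interrupts_per_sec > factor * baseline_interrupts
    if high_calls and high_interrupts:
        return "both"
    if high_calls:
        return "software"
    if high_interrupts:
        return "hardware"
    return "undetermined"
```

With a baseline of 10000 system calls/sec and 1000 interrupts/sec, a reading of 50000 system calls/sec and 1000 interrupts/sec is classified as "software".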

Subsequently we can examine the processor utilization of each process to determine which processes are causing the high processor utilization.

Solutions for processor bottlenecks

If the high processor utilization is caused by faulty drivers or faulty application software, we need to get the software fixed or replaced. If the high processor activity is caused by normal activity, we need to acquire upgraded hardware or reorganize our software schedule.

  • Upgrade the processor
  • Add more processors
  • Replace programmed I/O (PIO) devices
    • Use SCSI or Ultra DMA instead of normal IDE
    • Use bus mastering devices
  • Distribute applications onto other servers
  • If the utilization is caused by software, schedule processor-intensive tasks at less busy times, such as overnight
    • Schedule the tasks via Scheduled Tasks in Control Panel

Multitasking

When a program is loaded in Windows, it becomes a running process. Each process has at least one thread of execution, which is the path of programmed instructions. Windows supports multithreaded applications, which allow a process to have concurrent operations; that is, multitasking within the process. Windows assigns the use of the processor to individual threads and keeps track of state information for each thread.

Multitasking works best on computers with multiple processors. Windows directs threads to run on available processors and tries to run each thread on the same processor that it used previously. This mechanism is called processor affinity, and it takes advantage of the contents of a processor's cache: L1, L2, and L3 if available.

Processor Counters

The System, Processor, Process, and Thread objects contain counters that provide useful information about the work of your processor. Examine the following counters for details about computer processes.

Object     Counter                 Description
Processor  % Processor Time        The percentage of time the processor was busy during the sampling interval.
Processor  Interrupts/sec          The average rate per second at which the processor handles interrupts from applications or hardware devices. High activity rates can indicate hardware problems.
System     System calls/sec        Software calls to the operating system service routines.
System     Processor Queue Length  An instantaneous count of threads that are in the processor queue. Values consistently above 2 indicate an overworked processor.
System     Context Switches/sec    The average rate per second at which the processor switches context among threads on the computer. High activity rates can result from inefficient hardware or poorly designed applications. Compare this counter with Processor\% Privileged Time, Processor\% User Time, and Processor\% Interrupt Time.
Process    % Privileged Time       The percentage of time a process was running in privileged mode.
Process    % Processor Time        The percentage of time the processor was busy servicing a specific process.
Process    % User Time             The percentage of time a process was running in user mode.
Process    Priority Base           Windows 2000 schedules threads of a process to run according to their priority. Threads inherit base priority from their parent processes. The base priority level of the process can be: Idle, Normal, High, or Real Time.
Thread     Thread State            A numeric value indicating the execution state of the thread. See the table below.
Thread     Priority Base           The base priority level 1-31 for the thread. Threads inherit base priority from their parent processes.
Thread     Priority Current        The current priority level of a thread. This level can vary during operation.
Thread     Context Switches/sec    The average rate per second at which the processor switches context among threads. A high rate can indicate that many threads are contending for processor time.
Thread     % Privileged Time       The percentage of time a thread was running in privileged mode.
Thread     % User Time             The percentage of time a thread was running in user mode.
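Once counters like these have been logged, the samples can be summarized outside System Monitor. The sketch below averages each counter column of a comma-separated log. It assumes the PDH-CSV layout (a header row of counter paths, then one row per sample with the timestamp in the first column), which is the layout written by tools such as the typeperf command. The function name, the server name SERVER, and the sample values are hypothetical.

```python
import csv
import io


def average_counters(csv_text):
    """Average each counter column of PDH-CSV text (timestamp is column 0)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    averages = {}
    for column, name in enumerate(header[1:], start=1):
        values = []
        for row in data:
            try:
                values.append(float(row[column]))
            except (ValueError, IndexError):
                continue  # skip blank or missing samples
        if values:
            averages[name] = sum(values) / len(values)
    return averages


# Hypothetical sample log; the server name and values are invented.
SAMPLE = r'''"(PDH-CSV 4.0)","\\SERVER\Processor(_Total)\% Processor Time","\\SERVER\System\Processor Queue Length"
"01/15/2006 10:00:01.000","85.5","3"
"01/15/2006 10:00:02.000","90.5","5"
'''
```

Calling average_counters(SAMPLE) returns a dictionary keyed by counter path; for this sample the processor time averages 88.0 and the queue length averages 4.0.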

 

Priorities

Every thread has a numeric priority. The OS scheduler dispatches the highest priority thread to an available processor. If there is more than one thread at the same priority, they form a queue. There is a ready queue for each of the 32 priorities. Task Manager can be used to change the priority class of a process; the relative priority of each thread within that class is set by the application. The numeric priorities are illustrated in the following table. System Monitor can be used to monitor a thread's priority.

 

Process Priority Classes With Relative Thread Priorities

Thread priority   Process priority class
                  Real time   High   Normal   Idle
Time critical     31          15     15       15
Highest           26          15     10       6
Above normal      25          14     9        5
Normal            24          13     8        4
Below normal      23          12     7        3
Lowest            22          11     6        2
Idle              16          1      1        1
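The table can be treated as a lookup from a process priority class and a relative thread priority to a numeric priority. The following Python sketch copies the values from the table above; the names CLASSES, PRIORITY_TABLE, and numeric_priority are hypothetical and exist only for this illustration.

```python
CLASSES = ("Real time", "High", "Normal", "Idle")

# Rows copied from the table: relative priority -> one value per class,
# in the same column order as CLASSES.
PRIORITY_TABLE = {
    "Time critical": (31, 15, 15, 15),
    "Highest":       (26, 15, 10, 6),
    "Above normal":  (25, 14, 9, 5),
    "Normal":        (24, 13, 8, 4),
    "Below normal":  (23, 12, 7, 3),
    "Lowest":        (22, 11, 6, 2),
    "Idle":          (16, 1, 1, 1),
}


def numeric_priority(priority_class, relative_priority):
    """Return the numeric thread priority for a class and relative level."""
    column = CLASSES.index(priority_class)
    return PRIORITY_TABLE[relative_priority][column]
```

For example, numeric_priority("Normal", "Normal") returns 8, and numeric_priority("Real time", "Time critical") returns 31.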

Thread State Transitions

As threads are initialized they are placed in the appropriate priority queue in a Ready state. The thread at the front of the highest priority queue is placed in the Standby state.  When the processor becomes available, the operating system scheduler selects the Standby thread and puts it into the Running state. If the thread needs to wait for an I/O event, it will go into a Wait state. If it runs for its complete time-slice (quantum), it will be preempted and put back into its priority queue.
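The queueing behavior described above can be modeled in a few lines. This is a deliberately simplified Python sketch, not the Windows scheduler: it assumes a single processor, ignores the Standby and Wait states and priority boosts, and assumes every thread uses its full quantum. The class and function names are hypothetical.

```python
from collections import deque


class ReadyQueues:
    """One FIFO ready queue per priority level 0-31 (a simplified model)."""

    def __init__(self):
        self.queues = [deque() for _ in range(32)]

    def make_ready(self, thread, priority):
        """Place a thread at the back of the queue for its priority."""
        self.queues[priority].append(thread)

    def select_next(self):
        """Remove and return (thread, priority) from the front of the
        highest non-empty priority queue, or None if nothing is ready."""
        for priority in range(31, -1, -1):
            if self.queues[priority]:
                return self.queues[priority].popleft(), priority
        return None


def run(ready, quanta):
    """Dispatch threads for a number of quanta. Each thread runs for its
    whole quantum, is preempted, and returns to the back of its queue."""
    order = []
    for _ in range(quanta):
        selected = ready.select_next()
        if selected is None:
            break
        thread, priority = selected
        order.append(thread)                # the thread is Running
        ready.make_ready(thread, priority)  # quantum expired: back to Ready
    return order
```

With threads A and B at priority 8 and thread C at priority 4, four quanta are dispatched as A, B, A, B. In this model the lower priority thread C never runs while higher priority threads remain ready.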

Thread State Counter Values

System Monitor can display the state of each thread using the following numeric values.

State Description
0 Initialized.
1 Ready. The thread is prepared to run on the next available processor.
2 Running.
3 Standby. The thread is about to use the processor.
4 Terminated.
5 Waiting. The thread is not ready to run, typically because another operation (for example, involving I/O) must finish before the thread can run.
6 Transition. The thread is not ready to run because it is waiting for a resource (such as code being paged in from disk).
7 Unknown. The thread is in an unknown state.
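When thread state values are logged for later analysis, it can help to translate the numbers back into names. A minimal Python sketch using the values from the table above follows; the names ThreadState and describe_state are hypothetical.

```python
from enum import IntEnum


class ThreadState(IntEnum):
    """Thread State counter values, as listed in the table."""
    INITIALIZED = 0
    READY = 1
    RUNNING = 2
    STANDBY = 3
    TERMINATED = 4
    WAITING = 5
    TRANSITION = 6
    UNKNOWN = 7


def describe_state(value):
    """Translate a Thread State counter value into its state name.

    Values outside the table are reported as "Unknown".
    """
    try:
        return ThreadState(int(value)).name.capitalize()
    except ValueError:
        return "Unknown"
```

For example, describe_state(5) returns "Waiting".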

Interrupt Request Level (IRQL)

The IRQL is the priority ranking of an interrupt. A processor has an IRQL setting that threads can raise or lower. Interrupts that occur at or below the processor's IRQL setting are masked and will not interfere with the current operation. Interrupts that occur above the processor's IRQL setting take precedence over the current operation.

The particular IRQL at which a piece of kernel-mode code executes determines its hardware priority. Kernel-mode code is always interruptible: an interrupt with a higher IRQL value can occur at any time, thereby causing another piece of kernel-mode code with the system-assigned higher IRQL to be run immediately on that processor. In other words, when a piece of code runs at a given IRQL, the Kernel masks off all interrupt vectors with a lesser or equal IRQL value on the microprocessor.

System Interrupt Request Levels

Highest = 31 Bus Error
30 Power Failure
29 Interprocessor Communication
28 System Timers
... I/O Device Interrupts (DIRQL)
2 Dispatch Level
1 APC Level
Lowest = 0 Passive Level

Some interrupts are more important, or higher priority, than others. For example, system clock timer interrupts are central to many Windows functions, including preemptive multitasking. As a result, timer interrupts are assigned a higher interrupt request level (IRQL), and therefore a higher priority, than interrupts caused by peripheral devices such as printers and modems.

Note that thread dispatching is sometimes handled by sending a special interrupt called a deferred procedure call (DPC). Both DPCs and asynchronous procedure calls (APCs) are like bookmarks: they tell the processor to do the requested task when it has no higher-priority interrupts. Only when no interrupts are pending does a thread actually execute.

[Figure: IRQ levels and the logic for interrupt handling. Left panel: IRQ levels from High (31) through power fail, interprocessor notification, clock, Device 0 ... Device n, DPC, and APC down to Passive (0). Right panel: a device or timer interrupt frees any waiting threads; if a higher priority thread is ready, the running thread is preempted; otherwise, if the quantum has expired, the next thread runs; otherwise the current thread continues.]

Interrupt Service Routines

When an interrupt occurs, it is serviced by an interrupt service routine (ISR). A data structure called the interrupt dispatch table (IDT) defines which interrupt service routine will handle the interrupts occurring at each IRQL.

The kernel supplies the interrupt service routines for many system interrupts such as clock ticks, power failure, and thread dispatching. Other interrupt service routines are provided by the device drivers that manage peripheral hardware devices such as network adapters, keyboards, and disk drives.

Interrupt Masking

When an interrupt occurs at one IRQL, all interrupts at or below that IRQL are blocked or masked. If an interrupt occurs at a higher IRQL, then the higher priority interrupt is serviced immediately.
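The masking rule reduces to a comparison between the IRQL of an incoming interrupt and the current IRQL of the processor. The Python sketch below illustrates only that rule. The function names and the list of pending interrupts are hypothetical, and real interrupt delivery is performed by the hardware and the kernel rather than by code like this.

```python
def is_masked(interrupt_irql, current_irql):
    """An interrupt at or below the processor's current IRQL is masked."""
    return interrupt_irql <= current_irql


def next_interrupt(pending, current_irql):
    """Return the highest pending IRQL that is not masked, or None.

    pending is a list of IRQL values for interrupts awaiting service.
    """
    deliverable = [irql for irql in pending
                   if not is_masked(irql, current_irql)]
    return max(deliverable) if deliverable else None
```

While the processor runs at IRQL 28 (system timers), next_interrupt([1, 2, 29, 30], 28) returns 30: the power failure interrupt is serviced first, and the interrupts at levels 1 and 2 stay masked.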

Keyboard Exercise

The Windows Resource Kit includes an example called CPUStres.exe that will provide an artificial processor load on your system. Examine the options of CPUStres and view the effects with System Monitor.