Windows Server Troubleshooting - Methodology



2003-2006 Team Approach Limited
All rights reserved


 Troubleshoot What?

Computer networks have huge numbers of hardware and software components. To troubleshoot Windows networks, you first must look at the big picture.

Determine the Scope of the Problem

Start with a holistic view and identify the scope of the problem. Is the problem on all computers, one computer, or a group of computers with some common characteristic? Is the common characteristic a switch, a hub, or a server that they access? Once the major area is identified, you can narrow your focus to the detail that is causing the problem.
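As a minimal sketch of this scoping step, the following Python function looks for a characteristic that every affected computer shares and no unaffected computer has. The inventory layout, the computer names, and the attribute names (switch, server) are hypothetical and only illustrate the idea.

```python
def common_characteristics(inventory, affected):
    """Return (attribute, value) pairs shared by every affected computer
    and by no unaffected computer.

    inventory: dict mapping computer name -> dict of attributes (hypothetical layout)
    affected:  names of the computers that show the problem
    """
    affected = set(affected)
    if not affected:
        return []

    # Collect every attribute name that appears on an affected computer.
    attributes = set()
    for name in affected:
        attributes.update(inventory[name])

    shared = []
    for attribute in sorted(attributes):
        values = {inventory[name].get(attribute) for name in affected}
        if len(values) != 1:
            continue  # affected computers differ on this attribute
        value = values.pop()
        if value is None:
            continue
        # Discard the attribute if a healthy computer has the same value.
        seen_on_healthy = any(
            inventory[name].get(attribute) == value
            for name in inventory
            if name not in affected
        )
        if not seen_on_healthy:
            shared.append((attribute, value))
    return shared


# Hypothetical inventory: which switch and server each computer uses.
inventory = {
    "PC01": {"switch": "SW-A", "server": "FS1"},
    "PC02": {"switch": "SW-A", "server": "FS2"},
    "PC03": {"switch": "SW-B", "server": "FS1"},
    "PC04": {"switch": "SW-B", "server": "FS2"},
}

print(common_characteristics(inventory, ["PC01", "PC02"]))
```

An empty result suggests that the affected computers have nothing distinctive in common, so the problem is more likely local to each one.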


Check the Obvious

First, do not overlook obvious mistakes. Do all devices have power? Are all of the network cables plugged in? Have all error messages and error logs been viewed and noted? Checking the obvious takes little time and can save you an unnecessary investigation.

Divide and Conquer

If you do not attack your problems with a systematic methodology, you will quickly get overwhelmed by the numerous possible causes of the problem. The strategy that you must adopt is a divide and conquer approach. Devise a series of tests that will divide the numerous components into two groups where you know that the problem is in one group and the other group is problem free. Repeat this approach on the group of components with the problem until you identify the individual component with the problem.
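The divide and conquer strategy can be sketched in a few lines of Python. This is only an illustration under two assumptions: exactly one component is faulty, and you have a test that can tell whether a given group of components contains the fault. The function name, the component list, and the test are hypothetical.

```python
def isolate_faulty_component(components, group_has_fault):
    """Repeatedly halve the list of suspects until one component remains.

    components:      ordered list of suspect components
    group_has_fault: callable that returns True if the fault is in the group
    Assumes exactly one component in the list is faulty.
    """
    candidates = list(components)
    if not candidates:
        raise ValueError("no components to test")

    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if group_has_fault(first_half):
            candidates = first_half           # fault is in the first group
        else:
            candidates = candidates[middle:]  # first group is problem free
    return candidates[0]


# Hypothetical path between a workstation and a server.
path = ["workstation NIC", "patch cable", "hub", "switch", "router", "server NIC"]
faulty = "switch"  # stands in for the real, unknown fault

print(isolate_faulty_component(path, lambda group: faulty in group))
```

Each test removes about half of the remaining suspects, so six components need at most three tests instead of six.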

This methodology can be applied to all sorts of troubleshooting, but computer networks are unusual in the sheer number of hardware and software components they contain. As a result, network troubleshooting involves three distinct types of fault isolation.

  1. First, determine which node in your network has the problem. I call this first step node isolation.
  2. Second, determine if more than one protocol stack has the problem. If you use both TCP/IP and NWLink/IPX, determine if the problem is with Internet access, NetWare server access, or both. I call this second step stack isolation.
  3. Once you have identified the protocol stack with the problem, you must determine which layer has the problem. Is the problem in the network card, IP, TCP, or a higher-level service? I call this last step layer isolation.
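Step 2 can be recorded as a simple comparison of results. The sketch below assumes that you have already tried one application per protocol stack and noted whether it worked. The function name and the wording of the conclusions are my own, and the suggestion to suspect shared components when every stack fails is an assumption rather than a rule.

```python
def stack_isolation(stack_results):
    """Summarize which protocol stacks show the problem.

    stack_results: dict mapping stack name -> True if an application
                   that uses that stack worked, False if it failed
    """
    failed = sorted(name for name, worked in stack_results.items() if not worked)
    if not failed:
        return "No stack shows the problem"
    if len(failed) == len(stack_results):
        # Assumption: when every stack fails, look at what they share.
        return "All stacks fail: suspect components shared by every stack"
    return "Fault isolated to: " + ", ".join(failed)


# Hypothetical results: Internet access fails, NetWare server access works.
print(stack_isolation({"TCP/IP": False, "NWLink/IPX": True}))
```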

Use an appropriate tool to perform your fault isolation.

Problem | Isolation Type | Example of Test
? | Node Isolation | TraceRt shows you the path through an internet and stops at the node with the problem.
TCP/IP or IPX/SPX | Stack Isolation | Test each protocol stack by trying an application that uses that protocol.
HTTP / TCP / IP / NDIS | Layer Isolation | A failed PING test tells you that the problem is in the Network layer or lower. A successful PING indicates that the problem is in the Transport layer or above.
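For node isolation, the TraceRt output can be read by a short script. The following Python sketch finds the last hop that answered. The sample output is illustrative rather than captured from a real network, and the parser is a simplification that only recognizes ordinary hop lines and "Request timed out." lines.

```python
import re

# A hop line starts with the hop number, followed by timings and an address.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def last_responding_hop(tracert_output):
    """Return (hop_number, address) of the last hop that replied, or None."""
    last = None
    for line in tracert_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header, blank line, or "Trace complete."
        hop, rest = int(match.group(1)), match.group(2)
        if "Request timed out" in rest:
            continue  # this hop did not answer
        # The address is the last field; strip brackets around resolved names.
        address = rest.split()[-1].strip("[]")
        last = (hop, address)
    return last


# Illustrative output in the style of the Windows tracert command.
sample = """
Tracing route to 10.0.3.25 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2     4 ms     3 ms     4 ms  10.0.1.1
  3     *        *        *     Request timed out.
  4     *        *        *     Request timed out.

Trace complete.
"""

print(last_responding_hop(sample))
```

The node to investigate is the last hop that replied or the one immediately after it. To use real data, capture the text printed by tracert and pass it to the function.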

Figure: Types of Fault Isolation (diagram relating layer isolation, stack isolation across the TCP/IP and SPX/IPX stacks, and node isolation on the LAN)

Art or Science

Computer networks are engineered systems that can be analyzed and fixed using systematic, scientific methodologies. Because they are complex and change constantly, however, you can also save significant time and effort by using your intuition and your unique knowledge of your environment.

Troubleshooting Steps

Troubleshooting any problem leads to the steps in the common-sense flowchart on the first page. The challenge is to conceive a hypothesis and a test that will further isolate the source of the problem. For example, if your web browser will not communicate with a web server, try to communicate with the PING command to determine whether the problem is

  • an IP communications problem or
  • a higher-level TCP or HTTP web problem.
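The web browser example can be turned into a small Python sketch. The PING interpretation follows the text above. The additional TCP connection test, which separates a TCP problem from an HTTP problem, is my own assumption, as are the function names. The ping helper assumes the Windows ping command and treats a reply containing "TTL=" as success.

```python
import socket
import subprocess


def ping_succeeds(host):
    """Send one echo request with the Windows ping command (-n 1)."""
    result = subprocess.run(
        ["ping", "-n", "1", host], capture_output=True, text=True
    )
    # Assumption: a genuine echo reply line contains "TTL=".
    return result.returncode == 0 and "TTL=" in result.stdout


def tcp_connect_succeeds(host, port=80, timeout=3.0):
    """Try to open a TCP connection to the web server port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify_layer(ping_ok, tcp_ok):
    """Map the two test results to the layer that most likely has the problem."""
    if not ping_ok:
        return "Network layer or lower: IP communications problem"
    if not tcp_ok:
        return "Transport layer: TCP connection problem"
    return "Above the Transport layer: HTTP or web server problem"


# Hypothetical result: PING works but the TCP connection fails.
print(classify_layer(ping_ok=True, tcp_ok=False))

# With network access, combine the real tests, for example:
# classify_layer(ping_succeeds("www.example.com"),
#                tcp_connect_succeeds("www.example.com", 80))
```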

Try to investigate problems in a test environment rather than your production network to avoid user interruptions.

Long-Term Goals

Continuously strive to learn about Windows, network components, protocols, computer hardware, etc. The more you know, the better you will be at troubleshooting. Read the manuals. Learn about the troubleshooting tools presented here and develop your own methodology and toolkit.

These icons represent utilities that are described in the Tools Section