Selecting a Troubleshooting Approach
Selecting the most effective troubleshooting approach for a network problem allows you to resolve the problem more quickly and cost-effectively. To select an effective troubleshooting approach, you must do the following:
Determine the scope of the problem.
Apply your experience.
Analyze the symptoms.
Determining the scope of the problem means selecting the troubleshooting approach based on the perceived complexity of the problem. A bottom-up approach typically works better for complex problems, and a top-down approach is typically best for simpler problems; using a bottom-up approach for a simple problem might be wasteful and inefficient. Typically, when users report symptoms, you should use a top-down approach because the problem is likely upper-layer related. If symptoms come from the network (such as through an SNMP trap, error log, or alarm), a bottom-up approach will likely be more effective.
Applying your experience means that if you have troubleshot a particular problem (or a similar one) previously, you might know a shortcut that expedites the troubleshooting process. If you are less experienced, you will likely use a bottom-up approach regardless of the circumstances. In contrast, if you are skilled at troubleshooting, you might be able to get a head start by beginning at a different layer using the divide-and-conquer approach.
Analyzing the symptoms gives you a better chance of solving a problem, because the more you know about it, the more readily you can narrow down the cause. At times, you can correct a problem immediately simply by analyzing the symptoms and swiftly recognizing the culprit.
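The selection guidelines above can be summarized as a small decision function. The following Python sketch is illustrative only: the names Approach and select_approach and the three Boolean inputs are hypothetical, and the order in which the rules are checked (experience first, then the source of the symptoms) is an assumption rather than a fixed rule.

```python
from enum import Enum


class Approach(Enum):
    TOP_DOWN = "top-down"
    BOTTOM_UP = "bottom-up"
    DIVIDE_AND_CONQUER = "divide-and-conquer"


def select_approach(reported_by_user: bool,
                    experienced: bool,
                    seen_similar_before: bool) -> Approach:
    """Suggest a starting approach (hypothetical helper, not a fixed rule)."""
    # Less experienced troubleshooters tend to work bottom-up regardless.
    if not experienced:
        return Approach.BOTTOM_UP
    # Experience with a similar problem allows starting at a middle layer.
    if seen_similar_before:
        return Approach.DIVIDE_AND_CONQUER
    # User-reported symptoms are likely upper-layer related; symptoms from
    # the network (SNMP trap, error log, alarm) suggest starting at the bottom.
    return Approach.TOP_DOWN if reported_by_user else Approach.BOTTOM_UP
```

For example, select_approach(reported_by_user=False, experienced=True, seen_similar_before=False) suggests a bottom-up approach, which matches the guideline for symptoms that come from the network.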
As an example of selecting a troubleshooting approach, assume that you have identified two IP routers in your network that have connectivity but are not exchanging routing information. Before you attempt to solve the problem, select a troubleshooting approach. You have seen similar symptoms previously, and they point to a likely protocol issue. Because connectivity exists between the routers, you know that the problem is not likely at the physical or data link layer. Based on this knowledge and your past experience, you decide to use the divide-and-conquer approach, and you begin testing the TCP/IP-related functions at the network layer.
Having chosen to start at the network layer, you decide to ping one router from the router on the other side. If the ping is fully successful, the problem could be due to restrictive access lists or mismatched settings between the routing protocols at the two ends. By combining the divide-and-conquer approach with your experience, you have arrived near the problem (and hopefully its solution) quickly. Now, again using your knowledge and expertise, you can analyze the symptoms and hopefully identify the culprit.
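The decision made after the ping can be viewed as one step of the divide-and-conquer approach: if the test at the chosen layer succeeds, the layers below it are assumed to work and the search moves up; if it fails, the search moves down. The following Python sketch illustrates a single step under that assumption. The names OSI_LAYERS and next_layer are hypothetical, and the test result is supplied by the caller rather than obtained by actually pinging a router.

```python
from typing import Optional

# OSI layers ordered from bottom to top.
OSI_LAYERS = [
    "physical", "data link", "network", "transport",
    "session", "presentation", "application",
]


def next_layer(current: str, test_passed: bool) -> Optional[str]:
    """Return the layer to examine next, or None if no layer remains.

    A passing test at the current layer moves the search one layer up;
    a failing test moves it one layer down.
    """
    index = OSI_LAYERS.index(current)
    target = index + 1 if test_passed else index - 1
    if 0 <= target < len(OSI_LAYERS):
        return OSI_LAYERS[target]
    return None


# The ping between the two routers succeeded at the network layer,
# so the search continues above it.
print(next_layer("network", test_passed=True))  # prints "transport"
```

In the router example, the successful ping moves attention above the network layer, which is consistent with suspecting access lists or routing protocol settings rather than cabling or interface problems.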