The Network’s Slow Today

“Don’t fix it if it ain’t broke” is an often-heard troubleshooting maxim. Although users may report that a modern data network is faulty, there’s a good chance it isn’t broken in the traditional sense, just degraded. Nevertheless, if it isn’t working correctly, it’s the network manager’s job to solve the problem.

The cause of network problems can be found more quickly by following a systematic approach. Network Troubleshooting by Othmar Kyas presents a logical progression upward from the physical layer. This book, published by Agilent Technologies, is the source for much of the material presented here about wiring and built-in monitoring.

This approach assumes that the network worked well in the recent past and that something has since degraded its performance. To find the problem, the troubleshooting technician needs to know what has changed since the network last worked properly.

The cause of the problem could be as simple as a broken wire or unplugged connector. Because open or short circuits usually are easy to locate, eliminating physical-layer faults is a good place to start troubleshooting. The types of questions a technician might ask are summarized here.

  • Has anyone connected or disconnected a PC or any other component from the network?
  • Has anyone installed an interface card in a computer?
  • Has anyone stepped on a cable?
  • Has any maintenance work been performed in the building recently, for example, by a telephone company or building maintenance personnel?
  • Has any equipment or furniture been moved?

It doesn’t take long to ask enough questions to determine if the likely source of the problem is in the physical layer. Instruments such as digital multimeters, cable testers, oscilloscopes, time-domain reflectometers, and spectrum analyzers are all effective in determining the root cause of wiring-related signal impairments.

Built-In Monitoring

If the fault appears to be in a higher protocol layer, relevant information may be available from the strategic monitoring systems built into the network. The simple network management protocol (SNMP) is used by the network manager to communicate with SNMP software agents embedded within the network’s switches and routers. The SNMP agents interact with standardized objects in the management information base (MIB), the virtual database of the remote monitoring (RMON) system.

The data available from the RMON MIB includes descriptions of the network configuration, interfaces, and protocol implementations. This information gives a current view of the network’s extent and capabilities but not of its dynamic operation. That role is filled by several well-defined RMON data groups that support Open Systems Interconnection (OSI) Layer 2 analysis of present and past activity in Ethernet networks:

  • Statistics—List of typical Ethernet statistics such as multicasts, fragments, and collisions.
  • History—Storage of Ethernet statistics over a defined period.
  • Alarm—Definition of alarm thresholds and trigger events.
  • Hosts—Collection of statistics sorted by network host.
  • Host TopN—List of top n hosts, sorted by selected criteria, such as top five talkers.
  • Matrix—Record of communications relationships between network nodes.
  • Filter—Filtering of data packets by selected criteria.
  • Capture—Recording of data packets for later analysis.
  • Event—Definition of actions that can be triggered by parameters in other MIB groups.
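As a rough illustration of the kind of result the Hosts and Host TopN groups report, the sketch below tallies octets per source host and lists the busiest ones. The sample records and the `top_n_hosts` helper are hypothetical stand-ins for counters a real RMON agent would maintain; this is not how any particular agent is implemented.

```python
from collections import Counter

def top_n_hosts(frames, n=5):
    """Return the n hosts that sent the most octets, busiest first.

    frames: iterable of (source_address, octet_count) pairs (hypothetical data).
    """
    octets_by_host = Counter()
    for source, octets in frames:
        octets_by_host[source] += octets
    return octets_by_host.most_common(n)

# Hypothetical sample: (source MAC address, frame length in octets)
sample = [
    ("00:00:0c:aa:00:01", 1518),
    ("00:00:0c:aa:00:02", 64),
    ("00:00:0c:aa:00:01", 1518),
    ("00:00:0c:aa:00:03", 512),
    ("00:00:0c:aa:00:02", 64),
]

for host, octets in top_n_hosts(sample, n=2):
    print(host, octets)
```

Sorting by a different counter (frames, errors, broadcasts) would give the other "selected criteria" the Host TopN entry mentions.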

The more recent RMON 2 specification defined a number of similar groups associated with OSI Layer 3 and supported monitoring of all local area network (LAN) and wide area network (WAN) activity regardless of network topology. Further refinements such as SMON, a version of RMON to deal with switched networks, have been introduced to ensure the effectiveness of remote monitoring as network architectures continue to evolve.

The usefulness of RMON data in a troubleshooting situation depends very much on how the RMON capabilities have been implemented. A separate probe with dedicated hardware may be capable of monitoring and recording network activity at the full line rate.

This contrasts with RMON capabilities provided via built-in software applications in routers and switches. Integrated software RMON agents may cope with the average traffic level, but in heavy traffic conditions, they may drop packets or be deactivated automatically.

Protocol Analysis

The only tool guaranteed to provide a complete picture of network activity and packet timing is a hardware-based protocol analyzer. Low-cost portable analyzers consisting largely of control and analysis software, a PC, and a standard network interface card (NIC) may not be robust tools in difficult applications.

“As the processing speed of PCs increases, it may seem logical to…make use of the latest version of a notebook PC with appropriate software,” said Thomas Heim, a product marketing manager at the Agilent Technologies Communications Test Equipment Division. “This certainly sounds correct but is not true when it comes to critical performance and response time problems.

“We have found with many of our customers that a standard PC will not capture and analyze data in real time, especially when the user is interested in application information,” he continued. “The problem for the PC is not dealing with the average frame rate per second but rather with a burst of frames in a fraction of a second that will overload the capture process and result in wrong expert analysis information.”
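The distinction between average rate and burst rate can be seen in a toy queue model. The sketch below is only an illustration under assumed numbers: the buffer size, service rate, and traffic patterns are hypothetical and do not describe any particular PC or analyzer. Both traffic patterns have the same average frame rate, yet only the bursty one overflows the capture buffer.

```python
def dropped_frames(arrivals_per_ms, service_per_ms, buffer_frames):
    """Count frames lost by a capture process with a finite buffer.

    arrivals_per_ms: frames arriving in each 1-ms interval.
    service_per_ms:  frames the capture process can store per interval.
    buffer_frames:   frames the capture buffer can hold.
    """
    queued = 0
    dropped = 0
    for arrivals in arrivals_per_ms:
        queued += arrivals
        if queued > buffer_frames:
            # Anything beyond the buffer capacity is lost.
            dropped += queued - buffer_frames
            queued = buffer_frames
        # The capture process drains what it can during this interval.
        queued = max(0, queued - service_per_ms)
    return dropped

# Both patterns average 50 frames/ms over 100 ms (hypothetical figures).
steady = [50] * 100
bursty = [1000] * 5 + [0] * 95

print("steady drops:", dropped_frames(steady, service_per_ms=60, buffer_frames=500))
print("bursty drops:", dropped_frames(bursty, service_per_ms=60, buffer_frames=500))
```

In this model the capture process keeps up with the steady stream but loses most of the burst, even though its service rate exceeds the average arrival rate in both cases.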

There is a medium-speed solution that may be a good choice for some networks. Special NICs capable of much faster performance than a standard NIC are available from test-instrument manufacturers. As an example of a protocol-analyzer application working with such a card, Acterna’s LinkView® Classic Network Analyzer Software runs on a PC with the proprietary Acterna Advanced Ethernet Adapter Interface Card. This combination captures all frames and error packets in 10- and 100-Mb/s systems.

In contrast, a powerful hardware-based protocol analyzer can capture full-speed network traffic even for Gigabit and 10-Gb Ethernet WANs. Built-in filtering capabilities support real-time selection of the type of data to be acquired, significantly reducing the volume of data to be stored and analyzed. For example, Figure 1 shows network traffic analyzed by destination address.
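A minimal software sketch of filtering and per-destination accounting is shown below. It assumes Ethernet II framing (6-byte destination, 6-byte source, 2-byte EtherType) and uses hand-built sample frames; the function names are hypothetical, and a hardware analyzer would perform the equivalent selection in dedicated logic at line rate rather than in Python.

```python
import struct
from collections import Counter

def format_mac(raw):
    """Render 6 raw bytes as a colon-separated MAC address string."""
    return ":".join(f"{byte:02x}" for byte in raw)

def parse_ethernet_header(frame):
    """Split the 14-byte Ethernet II header into (dst, src, ethertype)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dst), format_mac(src), ethertype

def octets_by_destination(frames, only_ethertype=None):
    """Total captured octets per destination, optionally for one EtherType."""
    totals = Counter()
    for frame in frames:
        dst, _src, ethertype = parse_ethernet_header(frame)
        if only_ethertype is not None and ethertype != only_ethertype:
            continue  # filtered out: not stored, not counted
        totals[dst] += len(frame)
    return totals

# Hypothetical frames: header followed by 46 bytes of zero padding.
ipv4_frame = bytes.fromhex("00000caa0001" "00000caa0002" "0800") + bytes(46)
arp_frame = bytes.fromhex("ffffffffffff" "00000caa0002" "0806") + bytes(46)
capture = [ipv4_frame, ipv4_frame, arp_frame]

# Keep only IPv4 (EtherType 0x0800) and report octets per destination.
for dst, octets in octets_by_destination(capture, only_ethertype=0x0800).items():
    print(dst, octets)
```

Discarding frames that fail the filter before they are stored is what reduces the volume of data to be kept and analyzed.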

Capturing all the required data with accurate packet time stamping is more than just a desirable feature. “[The lack of this capability] is a major drawback to software-based analyzers, because real-time voice and video applications are gaining momentum at small and medium-size enterprises,” commented Ronen Ben-Yossef, vice president of products at RADCOM. “While Voice over Internet Protocol (VoIP) implementation is a major reason why IT managers acquire an analyzer, without accurate time stamping, jitter analysis is impossible. A stand-alone analyzer with an internal clock can provide delay, jitter, and voice-quality assessment.”
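To show why jitter analysis depends on time stamps, the sketch below computes a running jitter estimate from send and arrival times. It assumes the smoothed interarrival-jitter estimator defined for RTP in RFC 3550; the article does not say which method any analyzer uses, and the packet times are hypothetical. Because the estimate is built entirely from differences between time stamps, any error in the capture clock goes straight into the result.

```python
def interarrival_jitter(packets):
    """Running jitter estimate in the style of RFC 3550.

    packets: list of (send_time_ms, arrival_time_ms) pairs in arrival order.
    """
    jitter = 0.0
    previous = None
    for sent, arrived in packets:
        if previous is not None:
            prev_sent, prev_arrived = previous
            # Change in transit time between consecutive packets.
            d = (arrived - prev_arrived) - (sent - prev_sent)
            # Exponential smoothing with gain 1/16, as in RFC 3550.
            jitter += (abs(d) - jitter) / 16.0
        previous = (sent, arrived)
    return jitter

# Hypothetical voice stream: one packet every 20 ms.
constant_delay = [(0.0, 5.0), (20.0, 25.0), (40.0, 45.0)]
variable_delay = [(0.0, 5.0), (20.0, 30.0), (40.0, 45.0)]

print("constant delay jitter (ms):", interarrival_jitter(constant_delay))
print("variable delay jitter (ms):", interarrival_jitter(variable_delay))
```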

However, modern network architecture has limited the parts of the network that any one analyzer can address. The introduction of network switches caused an immediate problem for protocol-analysis vendors and users. Previously, analyzers could see all traffic and detect network errors as they occurred. In a switched environment, this is no longer the case. Connecting a protocol analyzer to a normal switch port only allows the user to see broadcast and local traffic.

Analysis ports, also called mirror or span ports, have been provided by network equipment suppliers to alleviate this problem. To a degree, they have been successful. A span port can be set up so that a protocol analyzer plugged into the port can view the traffic from one or many switch ports. Span ports, however, do not forward error packets, and the timing between packets will be changed from what it was in the actual network.

A recent solution taps into the network at a point where all the traffic is available. Passive in-line taps integrated into the trunk lines carrying traffic between routers and switches provide access to the aggregated data stream. All the data is available here, but it is moving at the highest rates in the network. Only special high-speed equipment can be used in these applications because 100% of the full-duplex traffic must be captured. Filtering is essential at these speeds to make the best use of available memory.

The usefulness of the different types of protocol analyzers becomes clear if you consider the trend toward a flatter core network with more intelligence and service provisioning occurring at the network’s edge. Wayne Newitts, marketing manager at the Network Diagnostics Division of Tektronix, said, “As you move toward the edge of the network, the number of discrete data and signaling links dramatically increases. It becomes increasingly difficult and inefficient to attempt to monitor all of these links. And, it is precisely there, at the high link-density network edges, where trouble is likely to occur.

“Therefore, the utility of stand-alone, portable protocol analyzers actually is increased,” he explained. “This doesn’t mean that centralized, distributed systems are unimportant in network management, but stand-alone portable and powerful test tools are necessary to troubleshoot and drill down to the root causes of problems. Remote monitoring solutions can identify fault conditions, but drilling down through the various protocol layers and solving the problem still require field-test capability.”

Because of the complexity of modern networks and their problems, the best troubleshooting procedures use all the information available. Robert Finlay, the product manager for protocol analyzers at Fluke Networks, said, “When RMON is used for troubleshooting instead of long-term monitoring, fast access to the RMON data is important. The network engineer needs to be able to poll the RMON data source frequently to get the fine changes in data necessary for troubleshooting.

“Accessing historical reports will be helpful in understanding what the baseline for that network is,” he continued, “but it usually won’t help determine the cause of the problem. RMON is a helpful tool in the network engineer’s tool bag, but it doesn’t replace the protocol analyzer.”
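Frequent polling is useful because RMON statistics are cumulative counters, so the quantity of interest is the change between successive polls. The sketch below turns a series of polled values into per-second rates, allowing for the wraparound of a 32-bit SNMP counter. The sample values are hypothetical and no SNMP library is used; how the values are fetched is left out.

```python
COUNTER32_MODULUS = 2 ** 32  # 32-bit SNMP counters wrap back to zero here

def counter_delta(previous, current):
    """Increase between two polls, assuming at most one wrap in between."""
    return (current - previous) % COUNTER32_MODULUS

def rates_per_second(samples):
    """Convert polled counter values into per-second rates.

    samples: list of (timestamp_s, counter_value) pairs in time order.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append(counter_delta(c0, c1) / (t1 - t0))
    return rates

# Hypothetical collision counter polled every 5 s; it wraps before the last poll.
polls = [(0, 4294967000), (5, 4294967200), (10, 204)]
print(rates_per_second(polls))
```

A shorter polling interval gives finer resolution of changes and makes it less likely that a fast counter wraps more than once between polls, which this simple delta cannot detect.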

While fundamentally agreeing about the role of the hardware protocol analyzer, Agilent’s Mr. Heim emphasized the addition of RMON capability. “A state-of-the-art analyzer is capable of reading SNMP/RMON data from the RMON agents at the same time it is analyzing local data while connected to a switch. Having access to both types of data means that it can correlate the information and display the result in one application. We tell our customers that RMON complements the troubleshooting process, but it does not provide great troubleshooting capabilities on its own.”

Conclusions

“Protocol analysis is the last resort when troubleshooting network performance issues,” said Fluke’s Mr. Finlay. “This is due to the complexities of analyzing huge amounts of data delivered on the high-speed networks of today—up to 1,000,000 packets/s on a Gigabit Ethernet network. However, once all other analysis resources have been exhausted, protocol analysis will reveal the secrets of network performance optimization and degradation that are hidden in the packets.”
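The packet rate on a link depends heavily on frame size. As a back-of-the-envelope check, the sketch below computes the theoretical maximum frame rate for Gigabit Ethernet, assuming standard framing overhead of 8 bytes of preamble and start-of-frame delimiter plus a 12-byte interframe gap. The helper name is hypothetical, and real traffic mixes frame sizes, so actual rates fall between these extremes.

```python
PREAMBLE_AND_SFD_BYTES = 8   # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12    # minimum idle time between frames, in byte times

def max_frames_per_second(line_rate_bps, frame_bytes):
    """Theoretical frame-rate ceiling for back-to-back frames of one size."""
    wire_bits = (frame_bytes + PREAMBLE_AND_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return line_rate_bps / wire_bits

GIGABIT = 1_000_000_000
for size in (64, 512, 1518):
    print(f"{size:>5}-byte frames: {max_frames_per_second(GIGABIT, size):,.0f} frames/s")
```

With minimum-size 64-byte frames the ceiling is roughly 1.5 million frames/s, the same order of magnitude as the figure quoted above, while full-size 1,518-byte frames yield only about 81,000 frames/s.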

A great deal of the troubleshooting guesswork has been eliminated by including expert analysis software in modern analyzers. For example, the Fluke OptiView Protocol Expert Software and OptiView Link Analyzer hardware help network engineers isolate problems that are captured or monitored in real time. These products support VoIP quality of service (QoS) analysis and can be operated remotely.

To improve the effectiveness of an enterprise’s few expert network engineers, Agilent recently introduced the Network Troubleshooting Center (NTC). This central software application controls remote agents and accepts data from them.

The agents can be RMON probes or software or hardware network analyzers. In all cases, the acquired data is made available to a centrally located group of experts and presented in a uniform and correlated manner. Depending on how the resources have been deployed, NTC can provide an overview and a drill-down capability.

Looking forward, RADCOM’s Mr. Ben-Yossef postulated improved network elements that would aid troubleshooting. “Monitoring ports were a significant first step in addressing the blind spots inherent in switched networks. But, lack of trained network technicians and confusing management software have limited their effectiveness. An ideal solution would embed a protocol analyzer chip in the switch. This would give the network manager full visibility of the traffic traversing the switch.”

Finally, it’s important to remember that few networks operate in a vacuum. Mr. Newitts of Tektronix added, “Networks interconnect and must be interoperable. Self-diagnosis typically does not address all the issues around network interoperability.

“As the network topology develops, mobile clients will be moving between personal area networks (PANs), such as Bluetooth and ultra wide bandwidth systems (UWB), LANs, and WANs including enhanced data for GSM evolution (EDGE) and CDMA2000. The interoperability issues will span not just different networks, but also different types of networks,” he observed. “The need for independent, standards-based portable and powerful stand-alone signaling analysis will be greater than ever.”

Published by EE-Evaluation Engineering
All contents © 2003 Nelson Publishing Inc.
No reprint, distribution, or reuse in any medium is permitted
without the express written consent of the publisher.

June 2003