Staying on Top of Packet-Switched Networks

Sept. 1, 2000

Congratulations! You just landed a new job as CEO of a regional fiber-optic communications network. The previous incumbent claimed stress-related health problems and opted for early retirement at age 42.

He had successfully guided the upgrade to OC-12 asynchronous transfer mode (622.08 Mb/s ATM) over synchronous optical network (SONET), and the program to implement digital subscriber line (xDSL) services was on target. However, he hadn’t coped well with the growing level of user defection due to poor network availability and quality of service (QoS).

You were selected by the board of directors because of the insight apparent in the questions and comments you offered during your interviews. These included “Why hasn’t the test budget been increased in proportion to the new network equipment?”, “You know, you can’t expect just to monitor network activity and bill customers,” and “How do you plan to achieve the QoS level required to retain present accounts and attract new ones?”

Test equipment may not have seemed an obvious answer to board members whose expertise lay in finance, sales, and marketing. However, higher traffic speed and density and much more emphasis on data than on voice traffic had brought greater network complexity. It was no longer possible to rely only on the operational support tools supplied by the network equipment manufacturers. Sure, these tools gave some diagnostic information, but what was actually needed was the flexibility to monitor network QoS and perform detailed troubleshooting across multiple network elements and protocols.

Networks

It’s not surprising that the network operator encountered problems as a result of the upgrade. “The migration of services to an ATM- or Internet Protocol (IP)-based network is more than a simple transition,” said Joe Luby, Ameritech’s general manager for network reliability and security. “It will require at least the same levels of network monitoring capabilities as we have today, and probably even enhanced levels, to provide customers the same level of service we supply currently.”¹

The growth of packet-switched networks (PSNs) within wide area data/voice networks (WANs) represents a large increase in complexity. Packet-switched WANs come in two flavors. ATM networks carry variable-length IP packets and other data, including real-time traffic, inside fixed-size ATM cells within SONET framing. Packet over SONET (POS) networks, on the other hand, carry IP packets directly within SONET framing. The addressing and error-control coding are contained in the packet header. Data also is classified according to type.
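To give a feel for what carrying variable-length packets in fixed-size cells costs, here is a minimal sketch of the arithmetic. It assumes AAL5 encapsulation, which the article does not specify: each packet gets an 8-byte trailer and is padded to a whole number of 48-byte cell payloads, and every 53-byte cell spends 5 bytes on its header. Any additional encapsulation header is ignored, and the function names are illustrative only.

```python
import math

ATM_CELL_BYTES = 53      # fixed ATM cell size on the wire
ATM_PAYLOAD_BYTES = 48   # payload carried by each cell (53 minus 5-byte header)
AAL5_TRAILER_BYTES = 8   # assumed AAL5 trailer added before padding


def cells_for_packet(packet_bytes: int) -> int:
    """Number of ATM cells needed to carry one packet under the AAL5 assumption."""
    if packet_bytes <= 0:
        raise ValueError("packet_bytes must be positive")
    return math.ceil((packet_bytes + AAL5_TRAILER_BYTES) / ATM_PAYLOAD_BYTES)


def wire_efficiency(packet_bytes: int) -> float:
    """Fraction of transmitted cell bytes that are original packet bytes."""
    return packet_bytes / (cells_for_packet(packet_bytes) * ATM_CELL_BYTES)


for size in (40, 576, 1500):
    print(size, cells_for_packet(size), round(wire_efficiency(size), 3))
```

Under these assumptions, a 40-byte packet fits in one cell, while a 41-byte packet needs two, which is one reason traffic mix matters when judging how much of a link's capacity is doing useful work.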

For simple data, the order of packets and their exact timing aren’t important. The original information can be reassembled correctly by examining the packet sequence numbers after they have been received. Why wouldn’t the packet order received be the same as was sent? Simply because PSNs route packets dynamically according to available capacity, required QoS, and the class of signal. Some of the packets may take a different route than others, so they can arrive out of order. ATM cells, on the other hand, don’t have sequence numbers and must be received in their exact transmitted sequence and follow a fixed route.
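The reassembly step for simple data can be sketched in a few lines. This is an illustration only: the `(sequence number, payload)` tuple format and the `reassemble` function are hypothetical and do not correspond to any particular protocol's header layout.

```python
from typing import Dict, Iterable, Tuple

Packet = Tuple[int, bytes]  # (sequence number, payload); hypothetical format


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Rebuild the original data from packets that may arrive out of order."""
    by_seq: Dict[int, bytes] = {}
    for seq, payload in packets:
        by_seq.setdefault(seq, payload)  # keep the first copy, ignore duplicates
    if not by_seq:
        return b""
    expected = range(min(by_seq), max(by_seq) + 1)
    missing = [seq for seq in expected if seq not in by_seq]
    if missing:
        # A gap means a packet was lost somewhere along the way.
        raise ValueError(f"missing sequence numbers: {missing}")
    return b"".join(by_seq[seq] for seq in expected)


# Packets that took different routes and arrived out of order.
arrived = [(2, b"work"), (0, b"packet "), (1, b"net")]
print(reassemble(arrived))
```

A gap in the sequence numbers is also the simplest evidence of packet loss, which is why counters of this kind show up in network monitoring statistics.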

For real-time signals such as TV and audio, the latency associated with packetizing, depacketizing, and the various network elements through which the packets must pass can be critical. Consequently, these types of signals use different ATM adaptation layers (AALs) within the overall ATM protocol that handle real-time signals differently than data packets.

To ensure that their critical signals are not corrupted, individual customers such as large companies contract for network service based on the amounts and types of data they send, through a service level agreement (SLA). The network operator guarantees the level of service needed and must verify that the network actually is delivering it, hence the interest in well-defined testing and monitoring procedures.

In addition, the companies are limited to no more than their contracted level of traffic. Within PSNs, switches and routers have finite capacity: there is only a given amount of bandwidth available. If many users were to disregard their SLAs and simultaneously transmit large amounts of data at high rates, the network would become overloaded, and QoS would decline.

ATM policing is one mechanism that protects network users from overload. All customers’ data rates and classes are monitored. If any user transmits at a rate higher than that company’s SLA allows, some of its cells are discarded to reduce the rate. In this way, other users operating within their contract limits are unaffected, and their data is handled normally.
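The article does not say which algorithm the policer uses. As an assumption, the sketch below uses the generic cell rate algorithm (GCRA) in its virtual-scheduling form, the conformance test commonly used to define ATM traffic contracts. The class name and the example numbers are illustrative.

```python
class CellPolicer:
    """GCRA (virtual scheduling) conformance check for one connection.

    increment: nominal time between cells, i.e. 1 / contracted cell rate
    limit:     tolerance for cells that arrive earlier than scheduled
    """

    def __init__(self, increment: float, limit: float) -> None:
        self.increment = increment
        self.limit = limit
        self.tat = 0.0  # theoretical arrival time of the next cell

    def conforms(self, arrival: float) -> bool:
        if arrival < self.tat - self.limit:
            # Cell is too early: the source is exceeding its contract.
            # The schedule is left unchanged and the cell would be discarded.
            return False
        # Conforming cell: schedule the next one a full increment later.
        self.tat = max(arrival, self.tat) + self.increment
        return True


# Contract: one cell per time unit, with half a unit of tolerance.
policer = CellPolicer(increment=1.0, limit=0.5)
arrivals = [0.0, 0.6, 1.0, 2.0, 2.1, 4.0]
print([policer.conforms(t) for t in arrivals])
# → [True, True, False, True, False, True]
```

Because a nonconforming cell does not advance the schedule, a user who briefly exceeds the contract loses only the excess cells, and traffic on other connections is not affected.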

PSNs and the intricacies of ATM are very different from the operation of circuit-switched networks (CSNs). The prime example of a CSN is the traditional plain old telephone service (POTS). In this system, traffic is steered by separate control signaling. Once the connections have been set up, the data or voice simply travels between the two ends, and only one source and one receiver can use the circuit. In contrast, many sources and receivers can share a single channel in a PSN.

Monitoring vs. Testing

A large number of statistics must be accumulated as evidence of a PSN’s state of health. The added complexity of ATM and the types of traffic it carries only increase the variables that can be measured: for example, lost packet count, errored packet percentage, lost ATM cell count, errored cell percentage, cell transfer delay, cell and address delay variation (jitter), traffic split by protocol, and traffic split by AAL. But because these measures simply indicate the network’s present behavior, they just hint at underlying problems.

“We’ve found that broadband network operators generally understand that they have to build test equipment into the network,” said Mike Gouveia, Spirent Communications’ vice president of strategic marketing, Adtech. “The main reason is because broadband networks are much more difficult to operate than traditional circuit-switched networks. Performance is not a given because it depends on how things are set up and what kinds of traffic loads are being offered. In any of the packet-switched or cell-switched ATM sites, for example, any user’s traffic has the potential to impact other user traffic.”
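A few of the health statistics mentioned earlier, such as loss, delay, and delay variation, can be derived from nothing more than send and receive timestamps. The sketch below is a simplified illustration: the sample format is hypothetical, and delay variation is taken as the peak-to-peak spread of the observed delays, which is only one of several definitions in use.

```python
from typing import Dict, Optional, Sequence, Tuple

# (send time, receive time) in microseconds; receive time is None if lost.
Sample = Tuple[int, Optional[int]]


def summarize(samples: Sequence[Sample]) -> Dict[str, Optional[float]]:
    """Compute loss ratio, mean delay, and peak-to-peak delay variation."""
    delays = [rx - tx for tx, rx in samples if rx is not None]
    lost = len(samples) - len(delays)
    return {
        "loss_ratio": lost / len(samples) if samples else 0.0,
        "mean_delay": sum(delays) / len(delays) if delays else None,
        "delay_variation": float(max(delays) - min(delays)) if delays else None,
    }


samples = [(0, 10), (1000, 1012), (2000, None), (3000, 3011)]
print(summarize(samples))
```

As the article points out, numbers like these describe present behavior only. A rising loss ratio says that something is wrong, not where or why.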

Centralizing testing and monitoring services is a trend that has progressed in step with the build-out of PSNs. It makes more sense to concentrate a few highly trained network technicians and managers in a central location rather than attempt to solve problems locally as was previously done in CSNs.

Increased complexity has necessitated more thorough training. At the same time, remote monitoring and better network tools have made centralized troubleshooting practical. Finally, the practice has proven to be economically more efficient.

In Adtech’s approach, a test head is located with each network switch. The test heads contain separately controllable traffic generators and analyzers, and all test heads can be controlled either directly or from a remote central location. Because the test equipment is distributed throughout the network, testing takes place locally, eliminating the data backhaul associated with centrally located test equipment.

A built-in test and monitoring system has the capability to continuously monitor the network to find faults as they develop. The types of tests that are possible depend on the mix of test equipment specified for each test head.

Of course, early PSNs are not as well equipped as more recently built or upgraded networks, and sufficient test capabilities may not have been provisioned. Doyle Mills, senior technical support engineer at Digital Lightwave, commented, “We often talk with operators about our equipment’s capability to monitor and test their networks, and they say that it is already built-in. But, they’re missing part of the solution.”

Mr. Mills compared the confidence a test engineer gains through the use of proven, separate test equipment to the less satisfying scenario of having the network test itself. In addition, when more traditional built-in monitor-only equipment is installed in a PSN, it is not capable of injecting test signals into the network, which limits the kinds of tests that can be performed. “Mostly, people are looking at results of the error-monitoring system built into the various protocols,” he concluded.

Another approach to distributed monitoring determines application performance locally. Agilent Technologies’ Firehunter software uses distributed agents to access application programs as a user would. For example, the execution of the application is monitored to identify slow response indicative of network congestion. Application-layer information can be correlated with lower-level data to help pinpoint the problem.

“We look at the service as it actually operates right now and use its performance as perceived by the customer to trigger the need to look into problems. We have our own agents that run our tests, but we also have been working with Cisco,” said Dave Colton, the custom engineering manager in Agilent Technologies’ Firehunter group.

“Within their routers, Cisco has included a service assurance agent that can run tests directly from the router to look at network and application performance,” he continued. “The Cisco tests allow us to provide jitter and other types of testing that are critical for voice over IP and multimedia types of applications. And because their routers are so popular, we can push our agents very close to the customer’s site.”

An alternative to integrating test equipment into a network is to gather information from remote locations using portable protocol analyzers. If the analyzers can be synchronized, the errored data can be acquired before and after passing through suspect network elements. Nevertheless, reassembling this data into an overall picture of network activity and deducing from it the probable cause of the error can be difficult and time-consuming.
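To show what the correlation step involves, here is a minimal sketch that compares two synchronized captures taken on either side of a suspect network element. The capture format, a mapping from a packet identifier to a timestamp, is an assumption made for illustration. Real analyzers export far richer records, and matching packets across capture points is often the hard part.

```python
from typing import Dict, List, Tuple


def compare_captures(
    upstream: Dict[str, int], downstream: Dict[str, int]
) -> Tuple[List[str], Dict[str, int]]:
    """Compare captures taken before and after a suspect element.

    Each capture maps a packet identifier to its timestamp in microseconds,
    with both analyzers assumed to share a synchronized clock.
    Returns the packets that never came out, and the transit time of the rest.
    """
    lost = sorted(pid for pid in upstream if pid not in downstream)
    transit = {
        pid: downstream[pid] - sent
        for pid, sent in upstream.items()
        if pid in downstream
    }
    return lost, transit


before = {"pkt-1": 100, "pkt-2": 200, "pkt-3": 300}
after = {"pkt-1": 140, "pkt-3": 390}
lost, transit = compare_captures(before, after)
print("lost:", lost)
print("transit (us):", transit)
```

Without synchronized clocks the transit times are meaningless, which is why the synchronization caveat matters as much as the captures themselves.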

Some protocol analyzers claim real-time operation, but they usually accomplish only real-time filtering. That is, they capture only the data of interest as it passes but cannot analyze it on the fly. The Sniffer Technologies division of Network Associates claims that the Sniffer instrument goes beyond protocol analysis and actually addresses real-time monitoring across all seven layers of the protocol stack.

Recently, Sniffer Technologies and Digital Lightwave entered into an agreement that makes the local area network (LAN)/WAN interface more accessible for test purposes. According to Digital Lightwave’s Bill Rooney, “Sniffer Technologies is very strong in the LAN area, but the company doesn’t have a way to tap into the backbone. That’s what we provide. We can access a particular dense wavelength division multiplexed (DWDM) channel and do a number of things with it. But, when it gets to higher level protocol analysis, it then can be routed to the Sniffer.”

How useful these or other test tools may be depends on what you are trying to do. Bahaa Moukadam, senior director of product marketing at Netcom Systems, manufacturer of the SmartBits™ range of test equipment, differentiated among troubleshooting, monitoring, and performance testing.

“If you have a big core switch or a big router and you want to test how well these perform under stress, then you could send a lot of traffic through all the ports of the device under test and gradually increase the traffic to find out where the breaking point is,” he said. “With SmartBits, you drive the simulated Internet traffic through the device or system under test so you can simulate how it will behave in the real world when it’s deployed on the Internet.”
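The ramp-up procedure described in the quote can be sketched as a simple loop. Everything here is hypothetical: `measure_loss` stands in for whatever a real traffic generator and analyzer would report at a given offered load, and `simulated_device` is a toy model with a fixed capacity, not a model of any actual switch or of the SmartBits interface.

```python
from typing import Callable, Optional


def find_breaking_point(
    measure_loss: Callable[[int], float], start: int, step: int, limit: int
) -> Optional[int]:
    """Raise the offered load step by step until the device starts losing traffic.

    Returns the first offered rate at which loss is observed,
    or None if the device survives everything up to `limit`.
    """
    rate = start
    while rate <= limit:
        if measure_loss(rate) > 0.0:
            return rate
        rate += step
    return None


def simulated_device(capacity: int) -> Callable[[int], float]:
    """Toy device under test: drops whatever exceeds its capacity."""

    def measure_loss(offered: int) -> float:
        return max(0.0, (offered - capacity) / offered)

    return measure_loss


# Offered load in Mb/s, stepped from 100 to 1,000 in increments of 100.
print(find_breaking_point(simulated_device(750), start=100, step=100, limit=1000))
```

A linear ramp is the most literal reading of “gradually increase the traffic.” In practice a binary search between a passing and a failing rate would reach the same answer in fewer trials.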

Among the types of tests supported are throughput, packet loss or corruption, and latency. Traffic can be generated on up to 80 ports simultaneously, and the received data can be monitored in real time. Because the test hardware is guaranteed to operate faster than the relevant network specifications, you can establish a safety margin above the Internet performance actually required. However, these stress-testing capabilities are used more often in a lab environment or during system provisioning than in a live network.

A distributed, remotely controlled test and monitoring system also can streamline the off-line commissioning of new equipment prior to turn-up by automating the many repetitive and detailed tests. Once the new network or network addition is running correctly, Adtech’s Mr. Gouveia explained, “Network operators keep an eye on network performance on a continuous basis by sending a very low level of test traffic along the same route as the customer traffic to determine the QoS for that traffic. If it crosses a predetermined threshold, an alarm triggers, and the operator can find and resolve the problem before the customer even notices it.”
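The threshold alarm Mr. Gouveia describes could look something like the following sketch. The choice of delay as the QoS measure, the moving-average window, and the class name are all assumptions for illustration. An operational system would track several measures and use thresholds drawn from the SLA.

```python
from collections import deque


class QosAlarm:
    """Raise an alarm when the recent average probe delay crosses a threshold."""

    def __init__(self, threshold_ms: float, window: int = 5) -> None:
        self.threshold_ms = threshold_ms
        self.recent = deque(maxlen=window)  # only the last `window` probes are kept

    def record(self, delay_ms: float) -> bool:
        """Add one test-traffic measurement; return True if the alarm should fire."""
        self.recent.append(delay_ms)
        average = sum(self.recent) / len(self.recent)
        return average > self.threshold_ms


alarm = QosAlarm(threshold_ms=20.0, window=3)
for delay in [10, 12, 11, 40, 45]:
    print(delay, alarm.record(delay))
```

Averaging over a short window keeps a single slow probe from paging the operator, at the cost of reacting a few probes later to a genuine degradation.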

Conclusion

Protocol analyzers, DWDM demultiplexing equipment with packet analysis capabilities, and test traffic generators are among the types of equipment you may need to help you keep your packet-switched network operating efficiently. But it’s not that simple.

Wavetek Wandel Goltermann’s (WWG) Jack Krupicka, senior director of strategic planning, said, “It makes a difference whether you’re talking about SDH and SONET telecom systems or IP-layer data traffic. Our typical customers separate these applications in terms of the ways they organize themselves.

“Whether they have telecom or data network background makes a tremendous difference,” he continued. “It’s a case of people having different skill sets and experience, so the test-equipment solutions we recommend are distinct for each group.”

The recent merger of Telecommunications Techniques Corp. (TTC) and WWG broadens Mr. Krupicka’s choices. TTC’s communications network analysis instruments, such as the T-BERD® and FIREBERD®, have been added to WWG’s existing telecom and enterprise WAN/LAN analyzer product lines.

Because the technology available to PSNs is changing so quickly, new opportunities exist for test-equipment manufacturers, especially when addressing the problems of heterogeneous networks. George Pubanz, a member of the technical staff in the Tektronix communications business unit, commented that many networks incorporate equipment from different vendors. In this situation, it may not be practical to install extensive built-in test capabilities.

“Clearly, the operator needs enough information to know that service is suffering and what type of degradation is occurring. You want to know what equipment to have on the truck,” he continued, “and the kinds of skills the technician must have to solve the problem on the first try.”

Reference

1. Agilent Technologies, “Telecom 99: Agilent Introduces System to Help Network Operators Manage Converging Networks,” press release, Oct. 4, 1999.

Additional Reading

“Remote Distributed Testing,” a white paper by Adtech, March 1999.

Acknowledgements

The following companies contributed to this article:

Adtech	800-348-0080
Agilent Technologies	800-452-4844
Digital Lightwave	727-442-6677
Netcom Systems	818-676-2300
Tektronix	800-426-2200
Wavetek Wandel Goltermann	919-941-5730

Published by EE-Evaluation Engineering
All contents © 2000 Nelson Publishing Inc.
No reprint, distribution, or reuse in any medium is permitted
without the express written consent of the publisher.