The eagerly awaited revision 3.0 of the Universal Serial Bus (USB) specification offers 10 times the speed of USB 2.0 even as it maintains backward compatibility with USB 2.0 and USB 1.1 devices. Early adopters ramping up their USB 3.0 (known as SuperSpeed USB) developments are testing this next-generation peripheral interconnect, which offers 5-Gbit/s data rates over copper. Developers are grappling with sophisticated physical-layer (PHY) features, including high-speed signaling, dynamic equalization, and power management, all of which add complexity to both the development and the test effort.
USB 3.0, like other high-speed serial protocols, introduces challenges at both the physical and protocol layers. The protocol component takes the greater part of the specification and adds rigid power management states that are mandatory for SuperSpeed certification.
SuperSpeed USB shares many physical-layer characteristics with PCI Express 2.0, including 5-Gbit/s signaling; 8b/10b encoding carrying an embedded clock; and support for active power states. However, unlike PCI Express 2.0, USB 3.0 targets external device interconnects.
Mechanically, the USB 3.0 connector has been designed to be backward-compatible with the USB 2.0/USB 1.1 connector. But attenuation on the 5-Gbit/s serial lines is considerable, and receivers operate on very small margins. Support for wired cable lengths of up to 3 m presents more challenges at these speeds, calling for techniques that electrically differentiate SuperSpeed USB from PCI Express.
Similarities can also be drawn between USB 3.0 and Serial Attached SCSI 6G. These interfaces offer comparable data rates and cable lengths. While the point-to-point link handling is substantially different, SuperSpeed USB is designed with an emphasis on storage applications. Planned enhancements to the USB 3.0 mass-storage driver stack will add multiplexed data streams to boost throughput.
STATE MACHINES RULE LINKS
USB 3.0, like its predecessors, is designed to work asynchronously over a differential pair. While higher-layer transfers remain primarily host-driven events, both host and device depend on state machines to keep track of link status. Beyond the request acknowledgement sequence, USB 3.0 maintains logical state machines for everything from power management through stream protocols, hub management, and error recovery. The power-management state machine even gets an appendix to further detail its operation.
In total, the SuperSpeed spec adds around 20 new state machines to the USB control interface. USB 3.0 devices must track entry and exit from all logical states at the link layer while simultaneously managing flow control buffers and packet framing. Pre-silicon simulation provides one method for getting early test coverage for complex IP designs. Nonetheless, post-silicon testing using real-world devices remains an essential step on the road to production release.
Debugging state machines can be simplified by adding debug code or stubs that can expose the current state in some manner. Developers electing to work with third-party IP libraries may not have this option.
One solution is to use a protocol analyzer between the host and device that can unobtrusively monitor and track the state changes of all the state machines as messages pass from one end to the other. This requires no debug stubs or modification of the base IP at either end of the link. Monitoring state changes in this manner adds no delays, since event timings will be the same as the production device.
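As a rough illustration of what such passive tracking involves, the sketch below replays decoded state observations from a capture and flags any transition that a reference table does not allow. The state names, the transition table, and the event format are all hypothetical simplifications; the actual USB 3.0 link state machine has many more states and substates.

```python
from dataclasses import dataclass

# Hypothetical, heavily simplified transition table (an assumption for
# illustration only, not the specification's state diagram).
ALLOWED = {
    "Polling": {"U0"},
    "U0": {"U1", "Recovery"},
    "U1": {"Recovery"},
    "Recovery": {"U0", "Polling"},
}


@dataclass
class Transition:
    time_ns: int
    port: str
    old: str
    new: str
    expected: bool


def track(events, initial="Polling"):
    """Replay (time_ns, port, state) observations and log every state change."""
    current = {}
    log = []
    for time_ns, port, new in events:
        old = current.get(port, initial)
        if new != old:
            allowed = new in ALLOWED.get(old, set())
            log.append(Transition(time_ns, port, old, new, allowed))
            current[port] = new
    return log


# Made-up observations: the last change skips Recovery and is flagged.
events = [(100, "host", "U0"), (500, "host", "U1"), (900, "host", "U0")]
for t in track(events):
    flag = "ok" if t.expected else "UNEXPECTED"
    print(f"{t.time_ns:>6} ns  {t.port}: {t.old} -> {t.new}  [{flag}]")
```

Because the tracker only consumes captured observations, nothing has to be added to the IP at either end of the link.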
With rising communication speeds, several time-related challenges present themselves. This may be counterintuitive, because one of the major advantages of migrating from synchronous parallel to asynchronous serial is that the time of flight for an edge isn’t as important. The time it takes a signal to propagate down the wire should not matter: the clock is embedded in the data, so they arrive at the other end together.
Timing does matter, however, in how the link handles errors. Instead of using massively redundant overhead to protect against occasional errors (USB 3.0 expects a bit-error rate of less than 10^-12), a timeout is usually employed to cover the cases where a critical or double failure can’t be completely avoided. Consequently, the developer needs a method to identify the root cause of an unexpected state change (e.g., a timeout causing a recovery).
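To put the bit-error-rate figure in perspective, the short calculation below estimates how often an errored bit would appear at the worst-case rate. It assumes the figure is read as 10^-12, that one direction of the link runs continuously at the 5-Gbit/s line rate, and that errors are counted on raw line bits; the variable names are illustrative.

```python
from fractions import Fraction

BIT_RATE = 5 * 10**9           # 5-Gbit/s line rate, one direction
BER = Fraction(1, 10**12)      # worst case: one errored bit per 10^12 bits

errors_per_second = BER * BIT_RATE        # 1/200
seconds_per_error = 1 / errors_per_second
errors_per_hour = errors_per_second * 3600

print(f"mean time between bit errors: {seconds_per_error} s")
print(f"bit errors per hour at the limit: {errors_per_hour}")
```

Under these assumptions a compliant link could still see an errored bit every few minutes, which is why recovery paths and their timeouts are exercised regularly rather than rarely.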
Protocol analyzers are designed to capture and display an exact copy of the two-way communication between a host and device. By presenting complex data exchanges in an easy-to-digest format, protocol analyzers allow developers to verify all 200+ timeout references on each transaction considerably quicker than if they used an oscilloscope or logic analyzer trace.
LINK SYNCHRONIZATION ISSUES
Establishing a link between a SuperSpeed host and device operating at 5 Gbits/s requires the receivers on both devices under test (DUTs) to extract clock and phase timing by locking to the electrical transitions as quickly as possible. USB 3.0 devices achieve this by repeating a series of special link-training symbols that enable PHY synchronization. The USB 3.0 specification allows for variations in the receiver’s abilities, as no two PHYs will lock at exactly the same rate.
USB 3.0 developers face a fundamental challenge when introducing protocol analyzers into these high-speed link synchronization sequences. To effectively debug link-up issues, the analysis system should capture and show precise timing information for each of the link-training state transitions.
This requires that the analyzer’s serializer-deserializer (SERDES) detect the electrical idle state and then achieve bit lock as quickly as possible after the DUTs enter the RX_EQ state. Any delay in locking to the signal can cause the analyzer to miss the synchronization “window.” If the protocol analyzer can’t synchronize with the TX/RX pair, the system will erroneously show “garbage” in the trace.
PHYS: CHICKEN OR EGG DILEMMA
It has become quite common for new serial technologies to leverage existing technologies to reduce development time and risk. The similarities between USB 3.0 and PCI Express 2.0 have allowed some early developers to use PCI Express PHYs to begin prototype testing. Although the data rate and symbol coding scheme are similar, the USB 3.0 specification has some specific enhancements for external device applications and special out-of-band signaling methods to maximize power savings.
In some cases, vendors will develop test chips that can be used on development platforms prior to the vendor integrating the analog block and digital block into a single ASIC. These early-stage, discrete PHYs allow PHY teams to begin characterization well ahead of availability of the production system-on-a-chip (SoC). Using early test PHYs involves some risk, as these prototypes may contain bugs or may be missing the full functionality of a production SoC.
Still another alternative is to use programmable SERDES, configured for PCI Express 2.0 PHY characteristics. While this approach enables testing between early device prototypes, subtle differences in implementation can require careful tuning of the front end. It’s also possible that devices from different vendors will not reliably operate together. This method is often used to enable the digital group within a design team to begin internal testing without waiting for USB 3.0 PHYs to become available.
Development of analyzers and testers for new technologies like SuperSpeed USB is also hampered by the scarcity of PHYs. The developers of protocol analyzers are under considerable pressure to provide test equipment as soon as the chip vendors first power up a prototype. The analyzers themselves frequently incorporate actual PHYs in their design. Yet, there are limited PHY options available during early development. Production silicon can lag the test market by as much as 12 to 18 months.
In some cases, test PHYs will be incorporated in analysis equipment design. This approach also runs some risk that early-stage development PHYs may not be functionally complete. Interoperability problems in the protocol analyzer may surface, manifesting themselves as link-synchronization issues when testing with production silicon.
Fortunately for the USB 3.0 community, alternate analog front-end (AFE) probing schemes for 5-Gbit/s signaling have been developed that reduce the reliance on prototype PHYs. Using programmable SERDES designed for 5-Gbit/s PCI Express allows test vendors to deliver reliable test tools well ahead of the first availability of USB 3.0 silicon. Fine-grained controls are provided for these programmable PHYs, which let the analyzers adapt to a variety of test setups (Fig. 1).
It’s anticipated that most USB 3.0 devices will use dynamic receiver equalization to overcome the signal loss that’s common when operating at 5 Gbits/s. For SuperSpeed devices, the equalization is adaptive so the devices can calibrate the receiver for different cable lengths.
PHYs will accomplish this dynamic equalization by cycling through special “spectrally rich data patterns” during link training. Circuitry on the SuperSpeed device can adjust the receiver eye pattern to minimize the effects of dielectric loss and crosstalk (Fig. 2). The SERDES on the analyzer must also provide some capability to equalize 5-Gbit/s signaling to ensure link synchronization.
Analysis tools based on programmable SERDES hold an advantage in this case. They provide options for tuning preemphasis and differential voltage—among other settings—to ensure signal fidelity that ideally should exceed what’s found on the system under test.
The USB 3.0 specification defines aggressive power-management strategies to extend battery life, reduce power consumption, and provide responsive devices. When a SuperSpeed device “wakes up” and turns its transmitters on to exit electrical idle (U1 transition to U0), synchronization must be re-established. If the upstream port doesn’t know that a device is going to reconnect, it’s unlikely that it will successfully meet the USB 3.0 timing constraints.
Therefore, an additional mechanism is provided for waking up the port from the quiescent state. The device issues a Low Frequency Periodic Signaling (LFPS) handshake to alert the upstream port and then moves through the required recovery and link-training states. The USB 3.0 specification currently defines a rigid exit latency of ~1 µs moving from U1 to U0 (operational) power states.
As with the initial link synchronization, the analyzer front end must also detect and capture each state during the frequent recovery sequences that are required when transitioning from power-save mode. Any delay in achieving lock after the exit from electrical idle that extends beyond this window will again leave a non-optimized analyzer unable to resynchronize with the link under test.
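A trace that timestamps each state makes the exit-latency requirement straightforward to check. The sketch below is a minimal example under stated assumptions: the event names (`LFPS_START`, `RECOVERY`, `U0_ENTRY`) are hypothetical labels for decoded trace events, latency is measured from the start of the LFPS handshake, and the 1-µs budget simply reuses the approximate figure quoted in the text.

```python
EXIT_BUDGET_NS = 1_000  # ~1 us budget from the text; treated as an assumption


def exit_latencies(events, budget_ns=EXIT_BUDGET_NS):
    """Pair each LFPS start with the next U0 entry in a time-ordered trace.

    Returns a list of (start_ns, latency_ns, within_budget) tuples.
    """
    results = []
    start = None
    for time_ns, name in events:
        if name == "LFPS_START" and start is None:
            start = time_ns
        elif name == "U0_ENTRY" and start is not None:
            latency = time_ns - start
            results.append((start, latency, latency <= budget_ns))
            start = None
    return results


# Made-up trace: the first exit meets the budget, the second does not.
trace = [
    (0, "LFPS_START"), (400, "RECOVERY"), (900, "U0_ENTRY"),
    (5_000, "LFPS_START"), (6_500, "U0_ENTRY"),
]
for start_ns, latency_ns, ok in exit_latencies(trace):
    print(f"exit at {start_ns} ns took {latency_ns} ns: {'ok' if ok else 'LATE'}")
```

A check like this is only as good as the analyzer's own lock time, since a late lock would hide the very events being measured.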
Higher levels of integration and complexity in chip design have led developers to first prototype complex designs using software emulators or FPGA-based development platforms. While most USB devices will be implemented as an SoC, these prototyping environments allow large portions of the digital logic or IP to be developed and tested at lower speed.
For bleeding-edge designs that outpace what can be implemented in a modern FPGA, there’s value in prototyping the design at reduced speed, perhaps quarter- or half-speed, to verify functionality. By capturing and analyzing SuperSpeed packets transmitted at user-defined clock frequencies, designers can verify MAC-layer logic before committing a design to silicon.
ERROR DETECTION AND RECOVERY
USB 2.0 used CRC5 and CRC16 checksum algorithms to verify data integrity at the packet layer. USB 3.0 adds a third, 32-bit cyclic redundancy check (CRC32) because of the larger supported data payloads. The polynomial used for the CRC5 is the same for USB 2.0 and USB 3.0, but the polynomials used for the USB 3.0 CRC16 and CRC32 checksums are new. Thus, CRC algorithms and circuits used in USB 2.0 can’t be directly re-employed for USB 3.0.
As with any new technology, there’s a risk of misreading the specification or not reaching the same consensus as the wider community. A common problem in USB 1.1 was the failure of some developers to acknowledge that the CRC didn’t follow the LSB rule defined near the start of the specification. Thus, the CRC was sent in reverse order to the requirement defined later in the specification, causing a situation in which devices could interoperate between themselves but failed when connected to devices from different vendors.
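The bit-ordering pitfall is easy to reproduce. The sketch below uses the familiar USB 1.x/2.0 token CRC5 (generator x^5 + x^2 + 1, register preset to all ones, complemented remainder sent MSB first while the data fields go LSB first) rather than any USB 3.0 framing, and the helper names are illustrative. It computes the CRC for address 0x15, endpoint 0xE, then shows how a receiver's residual changes if the CRC bits are sent in the wrong order.

```python
def crc_shift(bits, width, poly, reg):
    """Shift bits (in wire order) through a CRC register and return it."""
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    for bit in bits:
        feedback = bool(reg & top) != bool(bit)
        reg = (reg << 1) & mask
        if feedback:
            reg ^= poly
    return reg


def lsb_first(value, nbits):
    return [(value >> i) & 1 for i in range(nbits)]


def msb_first(value, nbits):
    return [(value >> i) & 1 for i in reversed(range(nbits))]


WIDTH, POLY, INIT = 5, 0b00101, 0b11111   # x^5 + x^2 + 1, preset to ones

# Token fields in wire order: 7-bit address, then 4-bit endpoint, LSB first.
data = lsb_first(0x15, 7) + lsb_first(0xE, 4)
crc = crc_shift(data, WIDTH, POLY, INIT) ^ 0b11111   # complemented remainder

# Receiver view: run the same register over data plus the received CRC bits.
residual_ok = crc_shift(data + msb_first(crc, 5), WIDTH, POLY, INIT)
residual_reversed = crc_shift(data + lsb_first(crc, 5), WIDTH, POLY, INIT)

print(f"crc={crc:05b} residual={residual_ok:05b} "
      f"reversed-order residual={residual_reversed:05b}")
```

Two devices that both reverse the CRC would agree with each other, since each checks what the other sends by the same wrong rule, yet both would fail against a device that follows the specified order.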
Protocol-aware tools provide a third-party interpretation of the specification to validate device behavior. Not only will such tools expose real bit errors, they also can reveal systemic misinterpretations of the specification.
Of course, the CRC allows the device to detect and retry frames that contain bit errors. Verifying whether the IP correctly recovers from CRC and other errors at the link layer is difficult to simulate using real devices. While an arbitrary waveform generator could be programmed to transmit some errors, the complexity of generating logical state-machine errors makes this a prohibitive task.
It has become standard practice to use protocol-aware exercisers for injecting errors and validating link-recovery behavior. An exerciser system will interface directly to the DUT over standard cabling and emulate real host or target behavior. To be effective, these exercisers must be able to establish link synchronization and direct the device to a specific logical state before injecting the error.
Users can create controlled test scenarios using a script-based higher-level language. With USB 3.0, however, a nearly continuous stream of link-layer handshakes (Idles, Skips, Header ACKs, etc.) makes it painstakingly difficult to construct device emulation behaviors using packet-level scripting.
Fortunately, a new generation of protocol-aware exercisers now automatically handles the low-latency handshakes to dramatically simplify test-script development. These exercisers feature a complete link-layer implementation that intelligently responds to logical state changes. These systems allow early adopters to start bring-up testing with USB 3.0 chips well before commercial hosts are available.
The most common application for exercisers is simulating simple bit errors by corrupting the CRC. With intelligent exercisers that can progress through multiple logical states, one can go further and test violations such as corrupting flow control or other link commands. Sending LCRD_A/B/D instead of LCRD_A/B/C/D and similar packet ordering errors should send the link into recovery.
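The credit-ordering violation described above can be expressed as a simple check on a captured sequence of link commands. The sketch below is illustrative only: it assumes the commands have already been decoded into name strings, treats the first LCRD_x seen as the starting point of the A/B/C/D cycle, and uses a made-up `OTHER_CMD` placeholder for unrelated traffic.

```python
ORDER = "ABCD"


def first_credit_violation(link_commands):
    """Return (index, expected, got) for the first out-of-order LCRD_x, else None."""
    expected = None
    for index, name in enumerate(link_commands):
        if not name.startswith("LCRD_"):
            continue  # ignore commands that are not credit returns
        letter = name[-1]
        if expected is not None and letter != expected:
            return index, "LCRD_" + expected, name
        # The next credit should be the following letter, wrapping D -> A.
        expected = ORDER[(ORDER.index(letter) + 1) % len(ORDER)]
    return None


good = ["LCRD_A", "LCRD_B", "OTHER_CMD", "LCRD_C", "LCRD_D", "LCRD_A"]
injected = ["LCRD_A", "OTHER_CMD", "LCRD_B", "LCRD_D"]   # LCRD_C dropped

print(first_credit_violation(good))       # None
print(first_credit_violation(injected))   # (3, 'LCRD_C', 'LCRD_D')
```

In a test plan, the index of the violation gives the point after which the trace should show the DUT entering recovery.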
Exercisers should also enable users to easily adjust the timing and frequency of errors to find boundary conditions that might not turn up during simulation. With 20 new state machines and the corresponding substate transitions, the difficulties of creating a comprehensive test plan for USB 3.0 become insurmountable without a protocol-aware exerciser system.
Another concern with testing USB 3.0 is the high data rate and its impact on memory management within analysis equipment. During data transfers, SuperSpeed links can flood a typical 1-Gbyte memory space in less than two seconds. Even during link synchronization, a nearly continuous stream of idle and flow control symbols can rapidly consume available capture memory. Event triggering, considered a luxury in USB 2.0, becomes essential with USB 3.0. Snapshot recording or spooling techniques aren’t practical at 5 Gbits/s. Users need the ability to trigger on specific bus conditions or symbols to isolate events of interest.
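The fill-time claim can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes both directions of the link are saturated, that the analyzer stores decoded bytes (8 payload bits per 10 line bits under 8b/10b) with no timestamp or metadata overhead, and that "1 Gbyte" means 2^30 bytes; a real instrument will differ on all three counts.

```python
LINE_RATE_BPS = 5_000_000_000    # 5 Gbits/s per direction
DIRECTIONS = 2                   # transmit and receive captured together
CAPTURE_BYTES = 2**30            # assumed 1-Gbyte capture buffer

# 8b/10b carries 8 data bits in every 10 line bits; 8 bits per stored byte.
bytes_per_second = LINE_RATE_BPS * 8 // 10 // 8 * DIRECTIONS
fill_time_s = CAPTURE_BYTES / bytes_per_second

print(f"capture rate: {bytes_per_second / 1e6:.0f} Mbytes/s")
print(f"buffer full after about {fill_time_s:.2f} s")
```

Per-symbol timestamps or raw 10-bit storage would shorten this further, which reinforces the case for triggering over free-running capture.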
The similarities between SuperSpeed and PCI Express 2.0 have allowed both silicon developers and test vendors to jumpstart their USB 3.0 development. With PCI Express 2.0 IP and expertise under their belt, several silicon design houses are expected to begin sampling USB 3.0 chip sets early in 2009.
Likewise, both electrical and protocol-layer test vendors have leveraged their PCI Express 2.0 techniques to deliver stable tools well ahead of the mainstream market. Nowhere is this more evident than in the critical signal-locking performance of these early analyzers.
Custom circuitry leveraged from PCI Express 2.0 analyzers has been adapted to USB 3.0 testers to provide impressive signal fidelity. This allows the analyzer to sit in the data path and seamlessly recover from electrical idle and capture the link-training sequence. All training parameters, including timing elements, are reported in the trace, as are individual bus and power state transitions.