Current trends in high-performance digital bus designs have pushed the synchronous data-acquisition speeds of high-end logic analyzers to their limits. Nevertheless, there remains a need to accurately capture state data at these extreme speeds. To accomplish that goal, today's digital designers must understand the challenges presented by high-performance bus designs. Among those challenges are the critical relationships between the setup and hold times of their device-under-test (DUT) and their logic analyzer, as well as the placement of the logic analyzer's sample position. This article examines the difficulties inherent in sampling high-performance buses with logic analyzers, and some strategies for addressing those difficulties, both manually and automatically.
Regardless of the approach used, one critical factor can't be ignored—the importance of setup and hold times. By definition, every synchronous digital circuit with both clock and data inputs will also have a setup-and-hold-time specification. As bus speeds continue to rise, it becomes increasingly important to understand exactly what this specification means in relation to the logic analyzer's ability to capture data when the DUT's signals are stable and not transitioning. This stable data region is referred to as the "data-valid window."
Before discussing these topics as they pertain to high-performance buses, it's a good idea to review the basic concepts at hand. The "setup time" is defined as the amount of time that the data must remain stable prior to a clock transition. It allows data at the inputs to become stable before being acted on by the gates. The "hold time" refers to the amount of time that the data must remain stable after a clock transition. It ensures that the gates will have enough time to act on the data. The sum of these two times is known as the "setup/hold window."
A logic analyzer operating in a synchronous sampling mode is no different from any synchronous circuit that it might be probing, as it too has a setup/hold window. The logic analyzer latches data appearing at its inputs on each active clock transition provided by the DUT, and this data must be stable for a specified time before and after the clock transition. To correctly sample data, the logic analyzer's setup/hold window must fit within the DUT's data-valid window (Fig. 1). Typically, the logic analyzer's sample position is in the center of its setup/hold window. Because the location of a data-valid window in relation to the bus clock may vary between different bus types, high-end logic analyzers let users adjust the position of their setup/hold window, relative to the sampling clock, in resolutions ranging from 500 ps down to 100 ps. This feature helps ensure an accurate measurement by placing the analyzer's setup/hold window and its sample position within the DUT's data-valid window.
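The fit requirement amounts to simple interval arithmetic. The sketch below (hypothetical numbers, with all times in nanoseconds measured relative to the active clock edge, negative meaning before the edge) checks whether an analyzer's setup/hold window lies inside a DUT's data-valid window:

```python
# Minimal sketch of the setup/hold "fit" check described in the text.
# All numbers are hypothetical, for illustration only.

def window_fits(dut_valid_start, dut_valid_end, la_setup, la_hold):
    """Return True if the analyzer's setup/hold window lies inside
    the DUT's data-valid window.
    la_setup: time data must be stable before the clock edge (ns)
    la_hold:  time data must be stable after the clock edge (ns)
    """
    la_start = -la_setup  # analyzer window opens la_setup before the edge
    la_end = la_hold      # and closes la_hold after the edge
    return dut_valid_start <= la_start and la_end <= dut_valid_end

# DUT data valid from 1 ns before to 2 ns after the clock edge:
print(window_fits(-1.0, 2.0, la_setup=0.8, la_hold=1.7))  # True
print(window_fits(-1.0, 2.0, la_setup=1.5, la_hold=1.7))  # False: setup too long
```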
When probing bus signals clocked at lower speeds (usually less than 200 MHz), the relationship between the DUT's setup/hold and data-valid windows and the logic analyzer's setup/hold window and sample position normally isn't an issue. This is because the logic analyzer's setup/hold window is relatively small compared to the DUT's data-valid window. But today, logic-analyzer users frequently face the challenge of having to know where the boundaries of that data-valid window lie in relation to the sampling clock edge. As bus speeds continue to increase, knowing the position of the data-valid window will determine a user's success at making logic-analyzer measurements.
High-Performance Bus Challenges
With computer and networking systems continuing to push performance limits to new heights, the current crop of bus designs runs at speeds that result in very narrow data-valid windows. In many cases, newer clocking schemes, such as double-edge and source-synchronous clocking, have replaced traditional clocking schemes. In fact, margins have tightened to the point that a single clock reference across all signals in a bus is often impractical. Some characteristics of today's high-performance memory buses and the challenges they present to traditional logic-analyzer usage include the following.
Double-edge clocking: Also known as double-data-rate clocking, double-edge clocking doubles the data-transfer rate by using each clock edge for data-transfer bursts to and from memory. The bus's basic clock rate is defined by the full clock period. Setup and control transfers typically operate at this basic rate; once set up, data transfers operate at twice the basic rate by using both clock edges (Fig. 2).
Timing margins for control transfers (mode, address, etc.) are generous due to two factors. First, data transfers operate at twice the rate of control transfers, so a logic analyzer able to run at the double-edge-clocked data rate is more than capable of tracking control transfers. Second, control transfers are unidirectional, flowing from the memory controller to the memories themselves.
This implies a certain amount of simplicity where the timing relationship between the clock and the control values is determined by a single chip in the system. Variation from chip to chip must be accounted for in system design, but timing in any one system is stable. Furthermore, unidirectional signals are easier to terminate electrically, providing cleaner signal swings to the receivers and to the logic analyzer.
Probing data transfers under double-edge clocking is much more challenging than probing control transfers, however. Before proceeding to an actual measurement, users must determine if the logic analyzer's synchronous clock speed can handle the data rate created by double-edge clocking. A bus described as a "200-MHz double-data-rate bus" is actually transferring data at 400 MHz. The analyzer must be configured to correctly process input data at this rate using both edges of the clock from the memory system. A logic analyzer with a maximum synchronous-sampling speed of 200 MHz can accept data values clocked in every 5 ns. If a 200-MHz bus is double-edge clocked, then data arrives every 2.5 ns, and the analyzer must actually operate at 400 MHz, or double the basic clock rate of the bus.
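The rate arithmetic above can be sketched directly (illustrative only):

```python
# Data rate and sample spacing for a double-edge-clocked bus,
# per the 200-MHz example in the text.

def ddr_rates(basic_clock_mhz):
    data_rate_mhz = 2 * basic_clock_mhz      # both clock edges carry data
    clock_period_ns = 1e3 / basic_clock_mhz  # one full clock period
    data_interval_ns = clock_period_ns / 2   # new data every half period
    return data_rate_mhz, data_interval_ns

rate, interval = ddr_rates(200)
print(rate, interval)  # → 400 2.5 (a 400-MHz data rate, new data every 2.5 ns)
```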
If the logic analyzer can handle the specified data rate, the user is confronted with a second, more interesting, challenge: when is the data valid at the logic analyzer? Unfortunately, this isn't just a simple matter of reading the data sheet. Several other factors are important. For example, when using a double-edge-clocking scheme, receiver thresholds have a critical impact on the position of valid data with respect to the clock.
A small error in threshold will produce a relatively large error in clock duty cycle, and a corresponding error in the apparent position of the valid data with respect to the clock. Compounding this potential problem is the fact that the error is in one direction on the rising clock edge and in the opposite direction on the falling clock edge. Some systems employ differential clocking for this reason, but few use differential data signaling. The effect that threshold error has on signal reception is shown in Figure 3.
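Assuming roughly linear clock edges, the timing error caused by a threshold offset is approximately the voltage error divided by the edge's slew rate, and it lands on opposite sides of the rising and falling edges. The numbers below are hypothetical:

```python
# How a receiver-threshold error shifts the apparent clock-edge position.
# Assumes linear edges; threshold error in mV, slew rate in V/ns.

def crossing_shift_ps(threshold_error_mv, slew_v_per_ns):
    """Time shift of a rising-edge crossing caused by a threshold error.
    A falling edge shifts by the same amount in the opposite direction,
    which is what distorts the apparent duty cycle.
    Units: mV / (V/ns) = ps.
    """
    return threshold_error_mv / slew_v_per_ns

shift = crossing_shift_ps(threshold_error_mv=50, slew_v_per_ns=1.0)
print(shift)  # → 50.0 (rising edge appears 50 ps late, falling edge 50 ps early)
```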
To add to this problem, data transfers, unlike control transfers, can be driven by any chip on the bus. Because the reference clock is probed by the logic analyzer at only one location, the data will arrive at the logic analyzer (and the controller chip) at different times relative to the clock, depending upon which device is driving. This isn't a problem for the logic analyzer when the data window is relatively wide (like 5 ns). Minor arrival-time differences aren't significant compared to 5 ns. But when the data-valid window shrinks to 2 ns (as on a 400-MHz bus, for example), variations in arrival time significantly reduce the size of the data-valid window.
Source-synchronous clocking: Data arriving at the receiver at different times relative to the clock, depending on which device is driving the data, is a major problem with a single reference clock on a bus (Fig. 4). Elaborate clock-distribution trees might alleviate this effect, but they're complex to design into crowded systems.
An alternative approach is to assign responsibility for driving the clock to the same device that's driving the data. In this scheme, the memory controller uses the control portion of the bus to set up a data transfer. Once the transfer is defined, the controller releases the clock and data lines. These lines are then driven by the device assigned to drive the data, which might be the memory controller for data writes, or a memory device for data reads. To further simplify physical layout, the data bus is divided into sections (usually 4, 8, or 9 bits wide), each with its own clock. The clock for each group defines the timing for that group only. Skew requirements are very tight within a group but may be looser between groups.
This approach allows the data to arrive at the receiver tightly synchronized with the arrival of the clock at the receiver, regardless of which device is actually driving the data. The receiver is designed to use each of the reference strobes for only the data associated with that strobe, thereby improving margins.
But this solution also introduces a very specific problem for the logic analyzer. There can now be eight or more clocks for the data, but the analyzer has only a few clock inputs. Therefore, the user needs to determine which clock to use and how to correctly sample the data associated with the other clocks. Although many clocks are associated with the data, the relationship between the clocks is tightly constrained. All of the clocks have exactly the same number of active edges, and even though the delay between any two clocks is relatively unknown, it's constant. This is because each clock is derived from the single reference clock through a path whose delay varies from chip to chip and may depend on board layout, but is essentially fixed for any given clock path.
For any given clock, these factors are nearly constant (thermal and power-supply drift being the remaining variables). Aside from these skews, any one data clock is as good as another, provided that another means can compensate for the differing delays in each part of the bus. Because per-channel skew adjustment is a common capability of logic analyzers, any data clock can be made to work successfully. But any choice requires the user to accurately adjust the sample position for each logic-analyzer channel.
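As a rough sketch of this compensation (the group names and delay values below are hypothetical), once one strobe is chosen as the analyzer's sampling clock, each group's sample position is simply offset by that group's fixed skew relative to the chosen strobe:

```python
# Per-group sample-position compensation under source-synchronous clocking.
# Delays (ns) of each strobe group relative to the chosen reference strobe;
# these fixed skews would be measured per system. Hypothetical values.

group_delay_ns = {"DQ[0:7]": 0.0, "DQ[8:15]": 0.4, "DQ[16:23]": -0.3}

def sample_positions(base_position_ns, delays):
    """Shift the base sample position by each group's fixed skew so that
    every group is sampled inside its own data-valid window."""
    return {group: base_position_ns + d for group, d in delays.items()}

print(sample_positions(1.0, group_delay_ns))
```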
Logic-Analysis Setup Strategies
The high-performance bus designs discussed in this article present some unique challenges when trying to obtain accurate and reliable data samples using a logic analyzer. Users can overcome these problems by adjusting the sample position of their logic analyzer so that the sample is within the desired data-valid window. They can either employ the features commonly available in traditional high-end logic analyzers, or implement a new automated approach that's available in the most recent generation of logic analyzers.
For a manual approach to optimizing a sample position, traditional logic analyzers offer a feature similar to the one shown in Figure 5. Using the on-screen menu, the user can adjust the logic analyzer's setup/hold window relative to the sampling clock signal from the DUT. Adjustments can be made on a whole bus or on each individual signal. The most efficient way to dial in the optimal settings is to first use an oscilloscope to determine the width of the data-valid window and its location relative to the clock. This information must be collected for each signal.
For instance, if the scope measurement indicates that the data-valid window is 3 ns wide and the clock edge occurs 1 ns into the data-valid window, the logic analyzer's setup/hold value could be set to (800-ps setup)/(1.7-ns hold). This setting would place the logic analyzer's setup/hold window and its sample position squarely inside the data-valid window (Fig. 6).
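The arithmetic behind this placement can be sketched as below. Centering an analyzer window of an assumed 2.5-ns width in the measured 3-ns valid window yields values close to the 800-ps/1.7-ns example:

```python
# Sketch of the manual placement arithmetic: given a scope measurement of
# the data-valid window, center the analyzer's setup/hold window inside it.
# The 2.5-ns analyzer-window width used below is an assumed value.

def place_window(valid_width_ns, clock_offset_ns, la_window_ns):
    """Center the analyzer's setup/hold window in the data-valid window.
    clock_offset_ns: how far into the valid window the clock edge falls."""
    before = clock_offset_ns                   # valid time before the edge
    after = valid_width_ns - clock_offset_ns   # valid time after the edge
    margin = (valid_width_ns - la_window_ns) / 2
    setup = before - margin                    # analyzer setup time (ns)
    hold = after - margin                      # analyzer hold time (ns)
    return setup, hold

print(place_window(3.0, 1.0, 2.5))  # → (0.75, 1.75)
```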
After defining all of the necessary settings, it's important to run a logic-analyzer measurement to verify that all desired data is being sampled correctly. Fine adjustments to the settings can be made iteratively as necessary. While the iterative nature of this process can be time consuming, often taking hours or even days for very complicated systems, it's typically a one-time process done at the initial stages of testing. It yields much more accurate and dependable measurement data. The analyzer's setup/hold settings can be saved and reloaded as necessary.
Obviously, the next evolutionary step is automating the sampling-point optimization. With this in mind, Agilent Technologies has introduced a new measurement capability in its highest-performance logic analyzers. A feature known as "Eye Finder" automates the measurement of stable data regions on digital buses and the placement of the logic analyzer's sample position on each channel. It's a specialized measurement that allows the logic analyzer to identify the size and position of data-valid windows for all of the buses and signals being probed in reference to the clock's edge.
To take advantage of this feature, the user must first probe the DUT and define the buses and signals in the logic analyzer's setup user interface, just as when making a standard measurement. When Eye Finder is run on the selected signals, the logic analyzer uses the delay lines available in its hardware to monitor for transitions on each probed signal for up to 2.5 million clock cycles.
Once it has identified the stable and transitioning regions around the clock edge for each signal, Eye Finder automatically adjusts the logic analyzer's sample position so that it's within a data-valid window. The results of an Eye-Finder measurement are presented in a visual display like the one shown in Figure 7. The user can see the position of the sample relative to the position of the clock edge, the skew between various signals, and the phase of the signal in which the sample takes place. The user can then make additional adjustments to the sample position on every individual channel as necessary.
This automated approach offers several advantages. One is that it performs its measurement using the same signals from the DUT that will be debugged by logic analysis. This provides optimal accuracy because the size of the data-valid windows and the placement of the sample positions are calculated based on the actual behavior of the signals under the same conditions as those in which the logic-analyzer measurement will be taken. Plus, Eye Finder can analyze the signals and place the sample positions in less than a minute, saving a significant amount of time in the debug process. Furthermore, Eye Finder runs repetitively over thousands of clock cycles to improve its accuracy. Finally, as with the manual adjustment of the logic analyzer's setup/hold window, the sample positions represented in Eye Finder can be moved in 100-ps increments to provide the amount of resolution required for fine adjustments within a data-valid window.
In the final analysis, high-performance bus designs will continue to present new and interesting challenges to logic analyzers. As with any other bus designs, the logic analyzer must be properly configured in order to accurately acquire the bus data. The current generation of high-performance buses makes it essential for users to adjust a logic analyzer's sample position through either a manual or an automated process. By using the setup functionality currently available in high-end logic analyzers, users will ensure that they're viewing accurate debug data.
Dave Sontag is a senior human factors engineer with Agilent Technologies in Santa Clara, Calif. He received a BEE from the University of Dayton, Ohio, and an MSE in human factors engineering from Wright State University, also in Dayton. Sontag can be reached via e-mail at [email protected]
Rick Nygaard is a research and development senior engineer with Agilent Technologies. He holds a BEE from the Georgia Institute of Technology, Atlanta. Nygaard can be reached via e-mail at rick_nygaard