With its 2007 release of the DDR3 SDRAM standard, JEDEC promised dramatic performance improvements at reduced power. The key to gaining those benefits lies in a complex physical-layer (PHY) interface that incorporates automatic calibration of both timing and impedances. By understanding the main features of the DDR3 interface, designers will be well positioned to make good use of the interface design intellectual property (IP) that is now available.
The promise of higher performance is easy to see. The DDR3 specification supports data rates of 800 to 1600 Mbits/s on each pin and device capacity as large as 8 Gbits, both almost doubling the DDR2 specifications. For its I/O voltage (VDDQ), DDR3 specifies 1.5-V operation—down from the 1.8-V operation of DDR2. This voltage reduction alone represents a 30% drop in power demand for the interface, but even greater savings are possible in some applications.
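The 30% figure can be checked with a short calculation. The sketch below assumes that interface switching power scales with the square of the supply voltage (the usual C·V²·f approximation) and that capacitance and frequency stay the same; the function name is illustrative only.

```python
# Assumption: switching power ~ C * V^2 * f, so with capacitance and frequency
# held equal the power ratio is the square of the voltage ratio.
VDDQ_DDR2 = 1.8  # volts, from the text
VDDQ_DDR3 = 1.5  # volts, from the text


def power_ratio(v_new: float, v_old: float) -> float:
    """Return P_new / P_old when only the supply voltage changes."""
    return (v_new / v_old) ** 2


ratio = power_ratio(VDDQ_DDR3, VDDQ_DDR2)
drop_percent = (1.0 - ratio) * 100.0
print(f"Power ratio: {ratio:.3f}, reduction: {drop_percent:.1f}%")
```

Under that assumption the ratio is about 0.69, which is consistent with the roughly 30% reduction quoted above.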
DDR3 offers more power-down modes than DDR2, so there’s an opportunity to reduce average power use through adaptive power control. Further, its timing specifications suit a variety of slew rates around the 1-V/ns nominal rate for signal transitions. This frees developers to slow down edges to reduce noise and power in exchange for lower timing margins and still meet the DDR3 qualification requirements.
But the DDR3 specification does more than decrease power and boost performance. Many of the changes from DDR2 to DDR3 work to simplify board design as well as reduce noise and improve timing (see the table). The specification provides a ZQ pin for on-chip calibration of I/O driver impedance, for instance, to reduce reflections while simplifying board layout compared to the off-chip calibration of DDR2.
Another facilitating feature is the Reset pin. In DDR2, the memory controller had to clear all memory registers individually to set the device to a known state. The DDR3 Reset pin will clear all internal states simultaneously. Some of the noise-reduction practices include adaptive on-die termination (ODT) of signal lines for trace impedance matching and an increase in package pin count to accommodate more ground and power connections.
There are also some operational differences between DDR2 and DDR3. The prefetch length, for instance, jumps from four words in DDR2 to eight words in DDR3 to speed access to sequential addresses in memory. It also maintains support for the shorter data transfers by allowing early termination of the burst (chop to four). Another operational difference is elimination of the single-ended I/O option for data strobes. DDR3 timing interfaces are all differential.
Built-in Timing Calibration
Some of the most significant differences between DDR2 and DDR3, however, revolve around the memory interface’s ability to automatically calibrate timing. Such calibration is in part required by a change in memory-card topology. It also has the side benefit of easing the burden on the board and memory controller designers to control timing by careful layout and matching of signal line lengths. The automatic calibration can accommodate layout variations as well as those that come with the topology change.
Understanding the calibration features in DDR3 begins with knowing basic timing. Like DDR2, the DDR3 memory interface is source synchronous. Each memory device generates a data strobe (DQS) along with the data (DQ) it sends out during a memory read operation. Similarly, the system must generate a DQS along with its DQ information when it writes to memory.
Typically, system memory interfaces divide their output data word into 8-bit lanes and provide a separate DQS for each lane. One difference exists between the read and write operations, though. The DQS generated by the system for a write operation has its edge centered in the data bit period, but the DQS supplied by the memory in a read operation is edge-aligned with the data (Fig. 1).
Due to this edge alignment, the system memory interface must be able to alter the timing of the read DQS so it’s positioned to meet setup and hold requirements for the registers capturing the read data. And to improve timing margins, account for trace variations, and reduce simultaneous switching noise (SSN) in the system, the DDR3 memory interface must be able to alter a host of other timing parameters as well. If the system uses dual in-line memory modules (DIMMs), for instance, the interface needs to provide write leveling.
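To give a sense of scale for the read DQS adjustment, the sketch below computes the bit period at the two ends of the DDR3 data-rate range and the nominal delay needed to move an edge-aligned strobe to the middle of the bit. It assumes an ideal half-bit shift with no skew or jitter; the function names are illustrative.

```python
def bit_period_ps(data_rate_mbps: float) -> float:
    """Duration of one data bit on a pin, in picoseconds."""
    return 1e6 / data_rate_mbps


def nominal_dqs_shift_ps(data_rate_mbps: float) -> float:
    """Ideal delay that moves an edge-aligned DQS to the center of the bit.

    Assumption: a perfect half-bit shift, ignoring skew and jitter.
    """
    return bit_period_ps(data_rate_mbps) / 2.0


for rate in (800, 1600):  # Mbits/s per pin, the range given in the text
    period = bit_period_ps(rate)
    shift = nominal_dqs_shift_ps(rate)
    print(f"{rate} Mbits/s: bit period {period:.1f} ps, DQS shift {shift:.1f} ps")
```

At the top data rate the whole bit lasts only 625 ps, so any fixed shift leaves little room for trace and circuit variation, which is why the remaining calibrations matter.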
Write leveling is an adjustment of the signal timing that compensates for variations in signal travel time, which arise because the DDR3 standard calls for a different DIMM routing topology than for DDR2. In DDR2 DIMMs, the control and address signals follow balanced traces in a “T” or star configuration to ensure that the signals reach each device simultaneously.
But to minimize SSN, the DDR3 DIMMs follow a fly-by architecture, which results in the control signals reaching individual memory devices with staggered timing (Fig. 2). To ensure proper margins at the memory end, the interface’s write circuitry must match the control signal arrival timing by staggering the launch of DQ and DQS for each device.
The process by which a memory controller determines the correct delay for each DQS uses a signal coming from the memory device. Each SDRAM uses the DQS from the interface to sample the clock (CK), asynchronously feeding the sampled clock signal back to the controller on one or more data lines. To calibrate the write leveling adjustments, the memory controller must sweep the DQS for each data group through its delay range.
The controller adjusts the delay one step at a time until it detects a zero-one transition on the sampled clock signal. The delay step that achieves this transition results in the alignment of DQS and CK at the memory end. By knowing the delay step that achieves alignment, the controller can determine the delays that will position the launch of DQ and DQS as needed to meet the memory’s setup and hold requirements.
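The sweep described above can be sketched in a few lines. This is a simplified behavioral model rather than any vendor's implementation: the SDRAM is represented by a hypothetical `make_dram_sampler` function that returns the level of an ideal 50% duty-cycle CK at the instant the DQS edge arrives, and the delay line is assumed to have uniform steps.

```python
from typing import Callable, Optional


def make_dram_sampler(ck_arrival_ps: float, ck_period_ps: float) -> Callable[[float], int]:
    """Model one SDRAM in write-leveling mode (hypothetical helper).

    The device samples CK with the incoming DQS edge and returns the sampled
    level. CK is modeled as an ideal square wave whose rising edge reaches
    the device at ck_arrival_ps.
    """
    def sample(dqs_edge_ps: float) -> int:
        phase = (dqs_edge_ps - ck_arrival_ps) % ck_period_ps
        return 1 if phase < ck_period_ps / 2 else 0
    return sample


def write_level(sample: Callable[[float], int], step_ps: float, num_steps: int) -> Optional[int]:
    """Step the DQS delay until the sampled clock goes from 0 to 1.

    Returns the delay step at which DQS and CK are aligned at the memory,
    or None if no 0-to-1 transition occurs within the delay range.
    """
    previous = sample(0.0)
    for step in range(1, num_steps):
        current = sample(step * step_ps)
        if previous == 0 and current == 1:
            return step
        previous = current
    return None


# Two devices on a fly-by bus see CK at different times (illustrative values).
CK_PERIOD_PS = 1250.0
for arrival in (400.0, 700.0):
    sampler = make_dram_sampler(arrival, CK_PERIOD_PS)
    step = write_level(sampler, step_ps=10.0, num_steps=128)
    print(f"CK arrives at {arrival:.0f} ps -> aligned at delay step {step}")
```

Running the sweep once per data group gives each lane its own delay step, which matches the staggered arrival of the clock along the fly-by route.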
Bit Skew Compensation
The write leveling process adjusts the memory-interface output timing to accommodate signal skew resulting from the fly-by topology of the address and control buses on a DIMM. It does not, however, compensate for skew between individual bits in a data group resulting from variations in board trace length and circuit speed in the interface. A separate delay control and calibration is needed for the interface to accommodate such skew. That bit-to-bit skew calibration can be incorporated within the read leveling process that handles the read signal skew from the fly-by topology.
In read leveling, the interface must work with a known reference data pattern stored in memory. The DDR3 specification provides a mechanism for storing the reference pattern in a special multipurpose register (MPR). When the MPR is active, its signals replace the DQ, data mask (DM), and data strobe (DQS) signals that would otherwise come from the memory array. Writing to the MPR in preparation for read leveling can occur at a reduced transfer rate so uncalibrated timing isn’t a concern.
A typical mechanism for implementing read leveling can adjustably delay the DQS and individual DQ signals arriving from memory (Fig. 3). The memory interface then uses the delayed DQS to clock each double-rate DQ into two data bits at half speed so a second stage can capture and align with the system clock for backend processing. The goal of the leveling process is to select a delay value for DQS that maximizes the timing margins for the double-rate clocking step.
At the data rates supported by DDR3, though, simply adjusting DQS may not provide enough timing margin because skew among the individual bits in a data group can be major fractions of the timing period (Fig. 4a). Ideally, the interface’s leveling process would also be able to determine and implement delays for each data bit as well as the DQS that will center the data on DQS (Fig. 4b). This would maximize the timing margins for the double-rate clocking step. It would also ease layout by compensating for travel-time variations.
A calibration process that will achieve this centering calls for the memory controller to sweep the DQS delay through its range one step at a time. For each DQS delay step, the controller would capture a read data pattern and compare it to the expected pattern, noting which bits pass or fail at each step.
When the sweep is complete, the controller has information on the range of DQS values for which each bit line performs correctly. This gives the controller enough information to select a DQS delay value. By adjusting the delay on each data bit, the controller can then shift that bit’s operational timing range to center on the delayed DQS signal. This process maximizes the timing margins on all of the bits.
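One way to picture the sweep and the per-bit centering is the sketch below. It is a simplified model under several assumptions not stated in the article: each bit passes over a single contiguous range of DQS delay steps, the DQ and DQS delay lines use the same step size, DQ delays can only be added (never removed), and the controller picks the latest window center as the DQS setting. The `read_passes` callback stands in for a real MPR pattern read and compare.

```python
from typing import Callable, List, Tuple


def sweep_dqs(read_passes: Callable[[int], List[bool]], num_steps: int) -> List[List[bool]]:
    """Capture per-bit pass/fail results at every DQS delay step."""
    return [read_passes(step) for step in range(num_steps)]


def passing_window(results: List[List[bool]], bit: int) -> Tuple[int, int]:
    """Return the first and last DQS step at which the given bit passed."""
    steps = [step for step, row in enumerate(results) if row[bit]]
    if not steps:
        raise ValueError(f"bit {bit} never passed during the sweep")
    return steps[0], steps[-1]


def level_reads(results: List[List[bool]]) -> Tuple[int, List[int]]:
    """Choose a DQS delay and per-bit DQ delays that center every bit on DQS.

    Assumption: delaying a DQ bit by n steps moves its passing window n steps
    later, so the DQS delay is set to the latest window center and each
    earlier bit is delayed to match.
    """
    num_bits = len(results[0])
    centers = []
    for bit in range(num_bits):
        first, last = passing_window(results, bit)
        centers.append((first + last) // 2)
    dqs_delay = max(centers)
    bit_delays = [dqs_delay - center for center in centers]
    return dqs_delay, bit_delays


# Illustrative stand-in for hardware: each bit passes over its own window.
WINDOWS = [(10, 30), (14, 34), (8, 28), (12, 32)]


def fake_read_passes(step: int) -> List[bool]:
    return [first <= step <= last for first, last in WINDOWS]


results = sweep_dqs(fake_read_passes, num_steps=48)
dqs_delay, bit_delays = level_reads(results)
print(f"DQS delay step: {dqs_delay}, per-bit DQ delay steps: {bit_delays}")
```

A real controller would also have to handle bits with no passing window or with windows that run off the end of the delay range; the sketch raises an error in the first case and ignores the second.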
Tracking VT Variations
The read and write leveling processes occur only once, during power-up initialization of the memory interface. During normal circuit operation, however, voltage and temperature (VT) variations can alter signal timing within the memory interface device by a significant fraction of the DDR3 data rate period. To keep timing robust, then, the memory interface should also compensate for these VT variations.
One way to accomplish this is to build a test pathway into the interface that mimics the data pathways and to calibrate that mimic path’s timing against a reference clock during initialization. By periodically checking the mimic path timing, the memory controller can determine the amount and direction of any timing shifts and apply an appropriate compensation to the memory interface signals.
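The mimic-path idea reduces to comparing a fresh delay measurement with the value recorded at initialization and converting the difference into delay-line steps. The sketch below is a minimal illustration with hypothetical names; it assumes the data paths drift by the same amount as the mimic path and that a slower path is corrected by removing programmed delay.

```python
def vt_correction_steps(reference_ps: float, measured_ps: float, step_ps: float) -> int:
    """Signed delay-line correction that cancels the measured drift.

    Assumption: the data paths drift exactly as the mimic path does, so a
    path that became slower (measured > reference) needs less added delay.
    """
    drift_ps = measured_ps - reference_ps
    return -round(drift_ps / step_ps)


def apply_correction(calibrated_setting: int, correction: int, max_setting: int) -> int:
    """Apply the correction to a leveled delay setting, clamped to the delay-line range."""
    return max(0, min(max_setting, calibrated_setting + correction))


# Illustrative values: mimic path measured 500 ps at initialization.
REFERENCE_PS = 500.0
for measured in (530.0, 480.0):
    correction = vt_correction_steps(REFERENCE_PS, measured, step_ps=10.0)
    setting = apply_correction(24, correction, max_setting=63)
    print(f"mimic path {measured:.0f} ps -> correction {correction:+d}, new setting {setting}")
```

The clamp matters in practice: if the leveled setting already sits near one end of the delay line, there may not be enough range left to absorb the full drift.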
Alongside its timing adjustments, the memory interface must also be able to control circuit impedances. The DDR3 specification provides a ZQ pin on the memory device as an attachment point for an impedance reference. The memory can use this reference to calibrate its own I/O impedances and present that information to the memory controller. The controller can then actively alter the memory’s ODT along with its own I/O impedances to maintain a good match among all of the circuit elements (Fig. 5).
The presence of all these calibrations and other dynamic adjustments in the memory interface—as well as the complexities of managing the activation of banks and precharge of rows, controlling memory mapping, and providing dynamic command sequencing to the memory array—makes the full design of a DDR3 memory interface a daunting task. For many developers, then, acquiring a DDR3 interface in the form of silicon IP will be the preferred path.
Such DDR3 interface IP is becoming increasingly available. Among FPGA providers, for instance, Altera and Xilinx offer reference designs that leverage their devices’ strengths to provide all of the required functionality. ASIC cores that chip designers can incorporate also are available from Rambus, Virage Logic, and others.
Design teams making an IP selection, though, will need to consider a variety of factors. One is the IP’s flexibility for adapting to specific system requirements. Flexibility may be important, for instance, in slew rate and output driver strength control, as is available in the Rambus and Virage Logic DDR3 IP. Signal slew rate directly impacts noise levels and power consumption as well as setting attainable performance limits. Systems that don’t need the highest memory-access performance may benefit from being able to scale back clock speed and slew rate to save power and reduce noise.
IP Flexibility Is Key
The ability to control drive strength permits developers to readily adapt the IP for the various load levels. A high drive allows the interface to maintain signal speeds for heavy loads while a lower drive helps minimize power demand with light loading. A wide range of options, such as the nine settings from 30- to 240-Ω drive strengths offered by Virage Logic, makes implementing the right strength for a specific installation easier.
Above and beyond the technical specifications of the IP, however, developers should consider the business aspects of working with a supplier. A vendor’s ability to provide support can be essential to meeting market windows. The size and financial strength of the vendor can also be key in assuring customers that the vendor will survive to continue offering support.
And then there’s the issue of cost. For now, DDR3 memory devices and the interface to support them are more expensive than DDR2. But an increase in one aspect of a system design effort may generate savings in another. Looking carefully at the performance increase, power reduction, and board design simplification that DDR3 SDRAM can bring to a system may be enough to tip the balance and encourage a move up to DDR3.