Packet-Based Data-Transfer Bus Keeps Interface Simple And Data Rates High
As microprocessors get faster, designers are using wider, high-speed buses to transfer data quickly enough to prevent the CPU from stalling. Yet it's getting harder for CPU and system designers to deal with wider buses that must run at clock rates of 100 MHz and higher.
A packet-based data-transport scheme developed by Advanced Micro Devices Inc. of Austin, Texas, promises to solve many of these issues. Its data-transfer interface can range from 2 to 32 bits wide in each direction. Known as the Lightning Data Transport (LDT), the interface combines a point-to-point user-definable bus width with a packet-based data-transfer protocol. It also can handle linked streams.
Gabriele Sartori, director of strategic marketing at AMD, says that his company's designers opted to use two unidirectional point-to-point links instead of a single bidirectional bus. This solves the problem of reversing the data direction (bus turnaround) on the bus as data flows back and forth. Each one of these links can be configured as 2, 4, 8, 16, or 32 bits wide. The widths are independently matched to upstream and downstream bandwidth needs. If a wider link is connected to a narrower port, or vice versa, the width is negotiated at system initialization. Commands, addresses, and data all travel over the same link, so few additional lines are needed to manage the interface.
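The article does not spell out how the width is negotiated at initialization. As a rough illustration only, the sketch below assumes the simplest plausible rule: each direction settles on the narrower of the two widths its transmitter and receiver support. The function name and the rule itself are hypothetical; only the list of allowed widths comes from the text.

```python
# Link widths named in the article.
ALLOWED_WIDTHS = (2, 4, 8, 16, 32)


def negotiate_width(tx_max_bits: int, rx_max_bits: int) -> int:
    """Pick a width for one unidirectional link.

    Assumption (not from the article): the link runs at the narrower
    of the two widths supported by its transmitter and receiver.
    """
    for width in (tx_max_bits, rx_max_bits):
        if width not in ALLOWED_WIDTHS:
            raise ValueError(f"unsupported link width: {width}")
    return min(tx_max_bits, rx_max_bits)


# Each direction is sized on its own, so the two links of one
# connection can end up with different widths.
downstream = negotiate_width(tx_max_bits=32, rx_max_bits=8)  # host -> device
upstream = negotiate_width(tx_max_bits=4, rx_max_bits=32)    # device -> host
print(downstream, upstream)  # prints "8 4"
```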
All transfers over the LDT link use differential signaling, similar to that used by low-voltage differential signaling (LVDS) interfaces. When clocked at 400 MHz, the interface uses dual-edge clocking to transfer 800 Mbits/s per pin pair, and 1600 Mbits/s if the data clock is doubled to 800 MHz. For short-distance CPU-to-CPU connections such as in multiprocessor systems, a dual 32-bit datapath can deliver an aggregate bandwidth of 12.8 Gbytes/s, which is up to 96 times that of a PCI 32-bit/33-MHz interface. AMD suggests using the 800-Mbit/s data-rate option for I/O interfaces.
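The bandwidth figures quoted above can be checked with a little arithmetic. The sketch below assumes that the quoted rates apply to each signal pair, that "aggregate" means both directions added together, and that the PCI baseline is 32 bits at 33.33 MHz (about 133 Mbytes/s). The function names are illustrative and not part of any AMD specification.

```python
def rate_per_pair_mbits(clock_mhz: float) -> float:
    """Dual-edge clocking moves two bits per clock cycle on each pair."""
    return clock_mhz * 2


def aggregate_gbytes(width_bits: int, clock_mhz: float, directions: int = 2) -> float:
    """Total bandwidth in Gbytes/s across all pairs and both directions."""
    total_mbits = rate_per_pair_mbits(clock_mhz) * width_bits * directions
    return total_mbits / 8 / 1000


# PCI 32-bit/33-MHz baseline: one 32-bit transfer per clock.
PCI_GBYTES = 32 * 33.33 / 8 / 1000

wide = aggregate_gbytes(width_bits=32, clock_mhz=800)
narrow = aggregate_gbytes(width_bits=8, clock_mhz=400)

print(wide, round(wide / PCI_GBYTES))      # 12.8 Gbytes/s, about 96x PCI
print(narrow, round(narrow / PCI_GBYTES))  # 1.6 Gbytes/s, about 12x PCI
```

The second case, an 8-bit link in each direction at the 800-Mbit/s rate, works out to roughly 12 times the PCI figure.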
Each pin pair consists of an LDT driver on one side and a receiver on the other (Fig. 1). Two pins are used per bit, with pin pairs swinging in opposite directions. The differential voltage swing is 1.2 V ±5%. To keep the interface affordable, the line impedance is designed for 50-Ω levels, letting designers use standard four-layer PC boards. Also, system designs can have trace lengths of up to 12 in. and still achieve 800-Mbit/s data transfers.
A complete LDT interface consists of the differential signal pairs, a clock line for every eight pairs, and several control lines. These control lines include a Power OK and a Reset LDT signal, as well as a pair of lines to indicate when a control packet is being sent. According to Sartori, an 8-bit data bus in each direction leads to a total LDT interface of just 55 pins, including 10 ground lines. That interface delivers 12 times the bandwidth possible on a PCI-32/33-MHz bus, but with fewer pins. An optional link power-down signal is available when the interface is used in mobile systems.
In a typical system configuration, the LDT interface can stem from a host bridge and propagate through a PCI-X bridge to a South bridge interface that can then talk to standard PC peripherals (Fig. 2). Packets sent over the interface use the standard plug-and-play device headers. The system sees a single LDT chain as one plug-and-play bus, and LDT-to-PCI bridges look like PCI-to-PCI bridges. The interface works with past, present, and future operating systems.
Packets transferred over the LDT interface are sent in multiples of 4 bytes. On LDT links narrower than 32 bits, adjacent bit times are used to concatenate the necessary bits to support the N × 4 transfers. Packets contain commands, addresses, or data. Data packets follow Write commands and Read responses and range in length from 4 to 64 bytes.
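To make the bit-time concatenation concrete, the sketch below counts how many bit times a packet occupies on links of different widths. It assumes one bit per signal pair per bit time. The function name is illustrative.

```python
ALLOWED_WIDTHS = (2, 4, 8, 16, 32)


def bit_times(packet_bytes: int, link_width_bits: int) -> int:
    """Number of bit times needed to move a packet across one link."""
    if packet_bytes <= 0 or packet_bytes % 4 != 0:
        raise ValueError("packets are sent in multiples of 4 bytes")
    if link_width_bits not in ALLOWED_WIDTHS:
        raise ValueError(f"unsupported link width: {link_width_bits}")
    # 4 bytes = 32 bits, and every allowed width divides 32 evenly,
    # so the division never leaves a partial bit time.
    return packet_bytes * 8 // link_width_bits


for width in ALLOWED_WIDTHS:
    print(width, bit_times(4, width), bit_times(64, width))
```

A 4-byte packet takes a single bit time on a 32-bit link but four on an 8-bit link, which is why narrower links trade bandwidth for pin count.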
In a typical LDT transaction, a CPU I/O Read operation starts with the CPU initiating an LDT Read command that is passed through to the I/O device (Fig. 3). The I/O device then sends back the data packets, which are relayed up to the CPU. Similarly, for a Posted Master Write, data appears to be written through to the memory. Actually, though, the LDT host interface simultaneously passes a request to the LDT South bridge, which requests the data from the hard disk. That data will be relayed up to the memory controller and then be written to the RAM.
When packets are transferred over the LDT interface, they are sent using an asynchronous clock forwarding scheme, with one clock line for every 8 bits (in each direction). A dual 32-bit interface would have eight clock lines (four in each direction). A single control line distinguishes command packets (reads and writes) from data packets. Since all operations are controlled via packet-based messages, all signals are managed in-band. This eliminates all the sideband signals that are typically needed to control interface operations. Even interrupts are handled via messages rather than hardwired signals.
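The clock-line count follows directly from the one-clock-per-8-bits rule. In the sketch below, the rounding up for links narrower than 8 bits is an assumption (such a link presumably still needs one forwarded clock), and the function name is illustrative.

```python
import math


def clock_lines(width_bits_per_direction: int, directions: int = 2) -> int:
    """Forwarded clock lines for a link pair, one per 8 data bits.

    Assumption: widths below 8 bits still use one clock per direction.
    """
    per_direction = math.ceil(width_bits_per_direction / 8)
    return per_direction * directions


print(clock_lines(32))  # dual 32-bit interface: 8 clock lines
print(clock_lines(8))   # dual 8-bit interface: 2 clock lines
```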
Multiple data streams can be sent over a single link since packets carry source and target IDs, making them easy to sort. Packets with the same ID are considered to be part of the same data stream, and up to 32 IDs can share the same link. All streams are sent to or from a host bridge and an LDT device. Peer-to-peer communications take place through a host bridge. Unless specified using the Fence and Flush commands, ordering of one stream does not affect another stream. Isochronous streams get the highest priority.
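The ID-based sorting described above might look something like the following sketch. The `Packet` class, its single `stream_id` field, and the 0-to-31 ID range are simplifications made up for illustration; the article says only that packets carry source and target IDs and that up to 32 IDs can share a link.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_STREAM_IDS = 32  # up to 32 IDs can share one link


@dataclass
class Packet:
    stream_id: int  # hypothetical field standing in for the packet's ID
    payload: bytes


def sort_into_streams(packets):
    """Group interleaved packets by ID, keeping arrival order per stream."""
    streams = defaultdict(list)
    for packet in packets:
        if not 0 <= packet.stream_id < MAX_STREAM_IDS:
            raise ValueError(f"stream ID out of range: {packet.stream_id}")
        streams[packet.stream_id].append(packet.payload)
    return dict(streams)


link_traffic = [
    Packet(3, b"a0"), Packet(7, b"b0"), Packet(3, b"a1"), Packet(7, b"b1"),
]
print(sort_into_streams(link_traffic))
```

Order is preserved only within each stream, which matches the article's point that the ordering of one stream does not affect another unless Fence or Flush is used.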
Basic commands start with a 6-bit type field and include functions such as Write, Read, Read Response, Fence (no posted commands or responses in a stream can pass it), and Flush (forces all posted commands to complete). Data packets ranging from 4 to 64 bytes long are sent in 4-byte multiples. Transfers of less than 4 bytes are padded. Sized Write commands contain a 32-bit mask field. That field is followed by up to eight 4-byte words that are sent in ascending order.
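The padding and mask rules can be sketched as follows. Two details here are assumptions rather than facts from the article: that padding bytes are zeros, and that each bit of the 32-bit mask enables one byte of the following payload (32 bits would cover exactly eight 4-byte words), with bit 0 matching the first byte. The function names are illustrative.

```python
def pad_to_dwords(data: bytes) -> bytes:
    """Pad a transfer up to the next multiple of 4 bytes.

    Assumption: zero bytes are used as padding.
    """
    if not data:
        raise ValueError("nothing to send")
    return data + b"\x00" * (-len(data) % 4)


def apply_byte_mask(mask: int, payload: bytes) -> dict:
    """Return {byte offset: value} for the bytes the mask enables.

    Assumption: mask bit i enables payload byte i.
    """
    if len(payload) % 4 != 0 or not 4 <= len(payload) <= 32:
        raise ValueError("payload must be one to eight 4-byte words")
    if not 0 <= mask < 2**32:
        raise ValueError("mask must fit in 32 bits")
    return {i: b for i, b in enumerate(payload) if (mask >> i) & 1}


payload = pad_to_dwords(b"\x11\x22\x33")  # 3 bytes padded to 4
print(payload.hex())                      # 11223300
print(apply_byte_mask(0b0111, payload))   # only the three real bytes
```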
AMD is working with over 56 partners that are developing LDT host bridges and devices. Before the end of the year, AMD expects a few LDT core logic components to be sampled. A few partners are expected to offer daisy-chainable LDT devices with two LDT links, too. LDT chips are in development to support desktop and mobile PCs and workstations. They're also targeting embedded applications such as servers, LAN routers, and switches. There is no license fee or royalty requirement when companies sign up as development partners. For more information, contact the company via e-mail at [email protected] or [email protected].