InfiniBand switches can push more data with links that run at the 10-Gbit/s speed of 4X InfiniBand. Handling this kind of throughput isn't easy, though. The dual-channel InfiniHost MT23108 chip from Mellanox Inc. addresses the problem with a building block that lets host processors deliver 20 Gbits/s of bandwidth over two InfiniBand ports with minimal overhead. Host processors struggle to deliver even a fraction of this bandwidth using other technologies.
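As a quick sanity check, the article's bandwidth figures follow directly from the lane counts. The sketch below is back-of-the-envelope arithmetic only; the 8b/10b figure is the standard InfiniBand line encoding, not a number quoted by Mellanox:

```python
# Back-of-the-envelope check of the bandwidth figures above.
# Assumption: "4X" means four 2.5-Gbit/s InfiniBand lanes per port.

LANE_RATE_GBPS = 2.5        # 1X InfiniBand signaling rate
LANES_PER_4X_PORT = 4
PORTS = 2                   # the MT23108 is dual-channel

port_rate = LANE_RATE_GBPS * LANES_PER_4X_PORT   # 10 Gbits/s per 4X port
total_rate = port_rate * PORTS                   # 20 Gbits/s aggregate

# InfiniBand 1.x uses 8b/10b encoding, so payload capacity is 80% of
# the signaling rate.
data_rate = total_rate * 8 / 10                  # usable payload bandwidth

print(port_rate, total_rate, data_rate)  # 10.0 20.0 16.0
```

The 20-Gbit/s headline number is thus raw signaling rate across both ports; usable payload is somewhat lower.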
High-performance InfiniBand operation is growing more important as its use spreads to server blades and to standards such as the PICMG 3.2 Advanced Telecommunications and Computing Architecture (ATCA) specification from the PCI Industrial Computer Manufacturers Group (PICMG). ATCA defines a passive backplane with an InfiniBand switch fabric.
The InfiniHost chip minimizes host interaction by supplying intelligent host data transfers (Table 1). The host simply posts InfiniBand mapping details to a queue, and the chip handles memory transfers directly as a PCI-X master.
Additionally, the MT23108 performs all InfiniBand protocol processing in hardware (Table 2). Low overhead is key to efficient server operation because it lets host processors concentrate on the application instead of handling the InfiniBand link.
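The offload model described above can be illustrated with a toy queue simulation: the host does nothing but post a descriptor, and the "hardware" handles packet construction and signals a completion. Every name below is invented for illustration; this is not Mellanox's actual programming interface:

```python
from collections import deque
from dataclasses import dataclass

# Toy model of the offload flow: the host posts a work request and the
# HCA does the protocol work.  All names are illustrative assumptions.

@dataclass
class WorkRequest:
    wr_id: int        # host-chosen identifier, echoed in the completion
    local_addr: int   # source buffer address
    length: int       # bytes to send
    remote_key: int   # access key for the remote memory region

class ToyHCA:
    def __init__(self):
        self.send_queue = deque()        # work queue: host -> HCA
        self.completion_queue = deque()  # completions: HCA -> host

    def post_send(self, wr):
        """Host side: a cheap descriptor push; no protocol work on the CPU."""
        self.send_queue.append(wr)

    def hardware_step(self):
        """HCA side: validate, packetize, and 'transmit' one work request."""
        wr = self.send_queue.popleft()
        # ...real hardware would build InfiniBand packets and DMA the data...
        self.completion_queue.append((wr.wr_id, "success"))

hca = ToyHCA()
hca.post_send(WorkRequest(wr_id=1, local_addr=0x1000, length=4096, remote_key=0x42))
hca.hardware_step()
print(hca.completion_queue.popleft())  # (1, 'success')
```

The point of the split is visible in the model: the host touches only the two queues, while everything between them runs without host cycles.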
The InfiniHost host-channel adapter (HCA) chip becomes the third leg of a 4X InfiniBand product line that includes the MT43132 InfiniScale switch chip and the MT21108 InfiniBridge target channel adapter (TCA). InfiniHost and InfiniBridge serve similar purposes, but InfiniHost is designed for general-purpose devices, like server blades or InfiniBand routers. The chip also suits higher-performance storage devices and subsystems. The Mellanox chips comply with release 1.0a of the InfiniBand Architecture (IBA) specification from the InfiniBand Trade Association, so they should work with other vendors' InfiniBand products.
On the other hand, the InfiniBridge typically bridges InfiniBand to other devices, like a Gigabit Ethernet controller. Mellanox's InfiniPCI software offers a way to link PCI adapters across InfiniBand. The InfiniHost chip is designed for native InfiniBand operation, although it can utilize PCI adapters connected to an InfiniBand network via InfiniBridge.
The MT23108's support for two InfiniBand links is key. It allows a redundant, switch-based hierarchy to be used for the fault-tolerant systems where InfiniBand will be found initially (Fig. 1). Both channels can be employed simultaneously for maximum throughput, with a fallback to one link if a switch or link fails. Full-mesh interconnects are possible with multiple MT23108 chips, but this tends to be impractical for large systems with dozens of hosts.
InfiniBand Links: By implementing the transport layer in hardware, the MT23108 keeps data moving. Its eight serializer-deserializers (SERDES) are split into two groups, one for each 4X channel (Fig. 2). Each SERDES operates at the 1X InfiniBand speed of 2.5 Gbits/s. Together, they reduce system chip count by eliminating off-chip physical layers (PHYs).
The other major hardware component is the address translation and protection table. This handles packet mapping at wire speeds and minimizes transfer latencies. Hardware memory protection also prevents invalid access to host memory. Hardware checking is applied to InfiniBand message sizes, too.
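Conceptually, a translation-and-protection table maps a (memory key, virtual address) pair to a physical address while enforcing bounds and access rights. The software caricature below shows the idea; the field names and key scheme are invented, not the chip's actual table format:

```python
# Toy address-translation-and-protection table.  Invalid keys, out-of-bounds
# addresses, and writes to read-only regions are all rejected -- the checks
# the article says InfiniHost performs in hardware.  Names are illustrative.

class TPTEntry:
    def __init__(self, va_base, pa_base, length, writable):
        self.va_base = va_base      # start of the registered virtual region
        self.pa_base = pa_base      # corresponding physical base address
        self.length = length        # region size in bytes
        self.writable = writable    # access-rights bit

class TranslationTable:
    def __init__(self):
        self.entries = {}           # keyed by memory-region key

    def register(self, key, entry):
        self.entries[key] = entry

    def translate(self, key, va, write):
        entry = self.entries.get(key)
        if entry is None:
            raise PermissionError("invalid memory key")
        if not (entry.va_base <= va < entry.va_base + entry.length):
            raise PermissionError("address outside registered region")
        if write and not entry.writable:
            raise PermissionError("region is read-only")
        return entry.pa_base + (va - entry.va_base)

tpt = TranslationTable()
tpt.register(0x42, TPTEntry(va_base=0x10000, pa_base=0x90000000,
                            length=8192, writable=False))
print(hex(tpt.translate(0x42, 0x10010, write=False)))  # 0x90000010
```

Doing this lookup per packet in hardware is what lets the chip validate and steer transfers at wire speed without host involvement.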
The chip uses off-board memory for tables and program data, in addition to on-chip caches for recently used information. On-demand paged virtual memory greatly simplifies off-board memory use. An on-chip double-data-rate (DDR) SDRAM interface handles 4 Gbytes or more of error-correcting code (ECC) memory in four DIMM modules and supports transfer rates of up to 2.5 Gbytes/s. The InfiniHost supports up to 16 million InfiniBand queue pairs.
Host Side: The InfiniHost has a 133-MHz PCI-X interface for management and data transfers, plus a glueless control connection with an 8-bit data bus. The glueless connection works with the Motorola PPC 860 family, including the 855T and 823, and it suits other processors with minimal external circuitry. The PCI-X bus runs from 50 to 133 MHz in PCI-X mode and up to 66 MHz in PCI mode. The chip supports 32- and 64-bit addressing.
Host overhead is significantly reduced because the MT23108 handles work-queue-element packet generation and reception. This covers validating access rights and constructing InfiniBand packets. Completion events can initiate cleanup actions.
The InfiniHost chip also operates in transparent mode, enabling software bridging. In this case, the packets are forwarded to the host, which sets up the area of memory for use. When the MT23108 is configured to act as an InfiniBand router, an entire InfiniBand packet is delivered to the host, including the packet's CRC value, so it can be examined and forwarded if necessary.
Furthermore, the chip can map local physical memory to a remote InfiniBand node. Access to remote memory is tunneled through the InfiniBand fabric without additional host interaction other than the initial queue pair setup and configuration. The MT23108 transforms memory accesses into InfiniBand messages and takes advantage of memory via the PCI-X interface.
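This remote-memory tunneling can be caricatured in software: after a one-time registration and key exchange, reads of remote memory are serviced entirely by the remote adapter, with no software running on the remote host. The structure and names below are illustrative assumptions only:

```python
# Toy RDMA-style remote read.  After setup, handle_rdma_read() stands in
# for logic that would run in the remote HCA's hardware -- the remote CPU
# is never involved.  All names here are invented for illustration.

class RemoteNode:
    def __init__(self, size=256):
        self.memory = bytearray(size)   # the remote host's exposed memory
        self.regions = {}               # key -> (base, length) registrations

    def register(self, key, base, length):
        """One-time setup: expose a memory region under an access key."""
        self.regions[key] = (base, length)

    def handle_rdma_read(self, key, addr, length):
        """Runs 'in hardware' on the remote adapter; no remote software."""
        base, rlen = self.regions[key]
        if not (base <= addr and addr + length <= base + rlen):
            raise PermissionError("access outside registered region")
        return bytes(self.memory[addr:addr + length])

remote = RemoteNode()
remote.memory[16:20] = b"DATA"
remote.register(key=0x7, base=0, length=64)

# Initiator side: a local memory access becomes an InfiniBand message.
print(remote.handle_rdma_read(key=0x7, addr=16, length=4))  # b'DATA'
```

The one-time setup cost (registration and key exchange) is the only host interaction; every subsequent access is just a message through the fabric.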
InfiniRISCy Business: Although the host processor deals with the PCI-X bus and control interface, the MT23108 has an on-board processor as well. This InfiniRISC processor keeps additional protocol processing on the chip, which reduces the load on the host processor. The InfiniRISC CPU isn't dedicated to any particular service, but it can supply functions like protocol translation between InfiniBand and Fibre Channel.
The InfiniRISC CPU can use all InfiniHost services and the off-board DDR memory where programs and data are stored. The memory is shared with the InfiniHost hardware services, like the queues and lookup tables.
The chip's JTAG interface offers access to the processor and other parts of the chip. System initialization can be handled through the I2C-compatible interface or from flash memory attached to the processor. Plus, the boot process can program that flash memory.
The InfiniHost should significantly increase InfiniBand acceptance. It reduces both system design overhead and host-processor support overhead while keeping throughput near the maximum. Most host processors expend a lot of cycles servicing even a 1-Gbit/s Ethernet link. With the InfiniHost, host processors will be able to keep the InfiniBand links saturated without dedicating themselves to simply moving data through them.
Price & Availability
Implemented in a 0.18-µm CMOS process, sample quantities of the InfiniHost MT23108 HCA will be available later this month. The MT23108 comes in a 580-pin L2BGA package. InfiniHost reference boards will be available to OEM partners for $5000. Available now, the InfiniScale MT43132 costs $458 and the MT21108 InfiniBridge runs $196, both in 10,000-unit quantities.
Mellanox Inc., 2900 Stender Way, Santa Clara, CA 95054; (408) 970-3400; www.mellanox.com.