Electronic Design

Switch-Chip Fuels Third-Generation InfiniBand

Low cost, a small footprint, and blazing speed keep InfiniBand ahead of the pack.

It's quite a feat: The tiny InfiniScale III MT47396 switch chip can support eight 30-Gbit/s 12x InfiniBand ports. Alternatively, it can be configured as 24 full-duplex, nonblocking, 10-Gbit/s 4x InfiniBand ports. At the chip's $949 price, that's less than $40 per port.

The chip, developed by Mellanox Technologies, also handles any combination of 12x and 4x ports for a total aggregate throughput of 480 Gbits/s. Yet it consumes only 18 W, making it an ideal centerpiece for an InfiniBand switch fabric. Integrating 96 2.5-Gbit/s serializer-deserializer (SERDES) units reduces power requirements and footprint. It also simplifies board design and cuts the bill-of-materials cost.
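The port and bandwidth figures above follow from the SERDES count, and a few lines of Python make the arithmetic explicit. This is only a sketch: the function names are illustrative, and the rule that a mixed configuration fits whenever its total lane count stays within 96 is an assumption, since the article does not say how lanes are grouped into ports.

```python
LANE_GBPS = 2.5      # signaling rate of one SERDES lane, per direction
TOTAL_SERDES = 96    # SERDES units integrated on the chip

def port_rate_gbps(width):
    """Per-direction rate of a port built from `width` lanes (4 or 12)."""
    return width * LANE_GBPS

def max_ports(width):
    """How many ports of a single width the 96 lanes can form."""
    return TOTAL_SERDES // width

def aggregate_gbps():
    """Total throughput, counting both directions of every full-duplex lane."""
    return TOTAL_SERDES * LANE_GBPS * 2

def mix_fits(n_12x, n_4x):
    """Assumed rule: a mix of ports fits if it needs no more than 96 lanes."""
    return n_12x * 12 + n_4x * 4 <= TOTAL_SERDES

if __name__ == "__main__":
    print(port_rate_gbps(4), port_rate_gbps(12))   # 4x and 12x port rates
    print(max_ports(4), max_ports(12))             # all-4x and all-12x port counts
    print(aggregate_gbps())                        # aggregate throughput
    print(mix_fits(4, 12))                         # four 12x plus twelve 4x ports
```

Under these assumptions, 24 4x ports and eight 12x ports both use exactly 96 lanes, and 96 lanes at 2.5 Gbits/s in each direction account for the 480-Gbit/s aggregate figure.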

High performance and high density are the reasons behind InfiniBand's increased recognition. Virginia Tech is building a $5 million supercomputer cluster based on G5 dual PowerPC processors from Apple and InfiniBand technology from Mellanox Technologies. This cluster is an excellent example of how standards-based InfiniBand is lowering the cost of computing. Other InfiniBand-based supercomputers are in the works.

InfiniBand suits high-performance clusters thanks to its switch latency of under 200 ns, low overhead, and high bandwidth. It has a lead on Ethernet, which has higher overhead, and is three generations ahead of the yet-to-be-released Advanced Switching for PCI Express.

InfiniScale III's performance improvement comes from the use of on-chip memory and hardware support of key operations like flow control and partition key checking. The latter restricts packets to a particular partition, allowing an InfiniBand switch fabric to host a number of protected partitions.
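Partition key checking can be pictured as a small table lookup on each port. The sketch below is not Mellanox's implementation. It assumes the P_Key layout commonly described for the InfiniBand architecture: a 16-bit value whose top bit marks full membership and whose lower 15 bits identify the partition, with two keys matching only when the partition bits agree and at least one side is a full member. The function names and the per-port table are hypothetical, and handling of invalid keys is left out.

```python
FULL_MEMBER_BIT = 0x8000   # assumed: top bit set means full membership
KEY_MASK = 0x7FFF          # assumed: low 15 bits identify the partition

def pkeys_match(packet_pkey, table_pkey):
    """True if the two keys name the same partition and at least one is a full member."""
    same_partition = (packet_pkey & KEY_MASK) == (table_pkey & KEY_MASK)
    one_full_member = bool((packet_pkey | table_pkey) & FULL_MEMBER_BIT)
    return same_partition and one_full_member

def port_accepts(packet_pkey, port_table):
    """Hypothetical per-port check: forward only if some table entry matches."""
    return any(pkeys_match(packet_pkey, entry) for entry in port_table)

if __name__ == "__main__":
    table = [0x8001, 0x0005]              # example entries, not real configuration
    print(port_accepts(0x8005, table))    # full member of partition 5
    print(port_accepts(0x0005, table))    # limited member meeting a limited entry
    print(port_accepts(0x8002, table))    # partition 2 is not in the table
```

Performing this comparison in hardware on both the inbound and outbound side lets the switch drop packets that stray outside their partition without involving software.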

The InfiniScale III is the fastest and densest InfiniBand switch around, with three times the port count of the eight-port 4x InfiniBand switch chips currently available. But ports are only part of the story, as the InfiniScale III does more than replace three second-generation chips. It actually replaces seven chips, because two must connect the other five. To complicate matters, those chip-to-chip connections run at only 10-Gbit/s 4x speeds, whereas the InfiniScale III has 480 Gbits/s of bandwidth available inside the chip.

A two-stage switch fabric built with the InfiniScale III can handle over 1500 ports. A three-stage switch fabric handles more than 5000 ports.

The local management interface for the InfiniScale III now uses a transaction-based command interface via a built-in channel adapter port. This allows in-band and out-of-band management to use similar protocols and support code. The CPU interface works with a PowerQUICC 8/16-bit bus. There's also a Bus Management Interface that's compatible with I2C. Furthermore, the on-chip InfiniRISC embedded processor lets developers customize the chip's behavior.

The InfiniScale III puts InfiniBand at the cutting edge. It is part of a balanced approach to connectivity. Host adapters are currently running at 4x speeds, providing bandwidth that matches current high-end processor performance. A 12x host adapter would be overkill at this point, but 12x links are ideal between InfiniScale III chips. This makes InfiniScale III's 12x support key to large, multistage fabrics. IP-over-InfiniBand support can make InfiniBand the only interface required for a blade server. InfiniBand is reaching its performance goals while many other technologies are still in their planning or testing stages.

The InfiniScale III MT47396 is priced at $949. It's available in a 961-pin, 40- by 40-mm HFCBGA package. The chip will also be available in a 1U 24-port MTS2400 switch.


  • 8 12x 30-Gbit/s ports, 24 4x 10-Gbit/s ports, or any combination of 4x and 12x ports
  • Auto-port width configuration and data-rate adaptation
  • 480-Gbit/s bandwidth
  • 96 2.5-Gbit/s SERDES units
  • Nonblocking operation with sub-200-ns latency via cut-through switching
  • PowerQUICC 8/16-bit command interface
  • I2C-compatible Bus Management Interface
  • 48k-entry unicast forwarding table
  • 1k-entry multicast forwarding table
  • Inbound and outbound partition key checking
  • Hardware CRC generation and flow control
  • Bad-packet filtering
  • Link-packet buffer-management interface
  • Integrated management support including Integrated Subnet Management Agent (SMA), Integrated General Service Agent (GSA) with Performance Management Agent (PMA), and Baseboard Management Agent (BMA)
  • JTAG debug interface