- Switch fabric: multiple end nodes, switch nodes
- Also PCI switch fabric (100% PCI compatible)
- End nodes: bridge between fabric, local buses
- Switch nodes: intermediate nodes in fabric
- Supports thousands of endpoints
- Three routing methods: PCI Addressing, Path, Multicast
- Eight Class of Service (CoS) levels: Asynch, Synch, Isochronous, Multicast, Hi-priority (Asynch, Isochronous), Address Routed, Provisioning, Special
- Variable-length frames to 128 bytes (with 16-byte lines)
- Built-in error detection, rerouting
- Enumeration assigns node IDs, sets up Source, PCI Address routing
- 622-Mbit/s LVDS line signaling rate
- 5 Gbits/s per port (four LVDS bidirectional pairs, full duplex)
- Six ports/switch
- 15-Gbit/s max switch-node bandwidth (25 Gbits/s for Multicast)
- Hot-plug capable connections
- Bandwidth reservation for real-time traffic
- Link drives 5 meters of Category 5 cables, connectors
- Line-credit-based flow control
- Link bundling (multiple links/connection)
- Redundancy (retransmission, redundant paths)
- PCI-Bridge StarFabric chip—bridges PCI bus to fabric
- StarFabric switch chip—six-port fabric switch chip
- Dual StarFabric Bridge chip CompactPCI board with PMC
- StarFabric Switch Pedestal Board
- Dual StarFabric Bridge chip CompactPCI board with J3 connections, PMC
- Bustronic 21-slot Hybrid Backplane (CompactPCI)
Switch fabrics are here to stay. They will deliver the next-generation base connection layers for data centers, Telecom/Datacom servers, and box-level connections. Of the emerging switch fabrics, StarFabric has targeted multi-faceted deployment. It provides a switch-fabric base for current PCI-based systems, as well as for H.110 and other standardized busing systems. It also delivers a pure switch fabric for base implementations.
Unlike InfiniBand, which represents a new connectivity paradigm, StarFabric comes on as a transitional connection base (Fig. 1). It can be deployed with today's technology, providing switch-fabric connectivity and bandwidth for existing PCI (mainly CompactPCI) systems. System engineers can simply use standard PCI to connect to the switch fabric and implement it as a high-bandwidth, redundant, adjunct busing system.
Designers can use the StarFabric to connect to existing Telephony and Telecom busing systems, like H.110/H.100 or ATM. Here, the switch fabric can function as a switch, linking multiple H.110/H.100 bus systems together, or as a bridge connecting such busing systems to a standard PCI bus connection on the other side of the fabric.
StarGen intends to make StarFabric a standard. It's forming a StarFabric Trade Association to license and make available the StarFabric protocol. StarGen is working with PICMG, the creators of CompactPCI, to develop StarFabric-compatible boards too.
StarFabric is deployable. StarGen is sampling two chips, the StarSwitch and the PCI-to-StarFabric Bridge. The switch is a six-port switch node that makes up the StarFabric (Fig. 2). Acting as an end node, the bridge chip bridges a PCI bus connection to the StarFabric (Fig. 3). Another chip, the T8150 from Agere, bridges the H.110/H.100 telephony bus to the StarFabric. Moreover, Agere is working on an ATM bridge chip to bridge ATM WAN connections to StarFabric.
StarFabric implements a sophisticated switch fabric composed of end nodes and switch nodes. The end nodes bridge to existing busing systems, while the switch nodes make up the switch fabric itself, which provides a redundant, high-bandwidth connection fabric. It has built-in QoS, error checking, hot-swap capability, and recovery. Plus, the fabric features multiple levels of connectivity: it can serve as a PCI fabric, using PCI addressing and bridging rules, or it can function as a standalone switch fabric with built-in source or multicast addressing.
On the simplest level, StarFabric can be employed as a PCI fabric with its end nodes (PCI-to-StarFabric bridges) linking to PCI buses. In the PCI mode, the fabric supports full PCI addressing and bridging. It handles both transparent (standard PCI host) and nontransparent (embedded) bridging. With nontransparent bridging, a computer system appears as a virtual PCI peripheral and keeps control of its own memory space and peripherals.
StarFabric defines a three-layer protocol with a Fabric, Link, and Physical layer (see the table). The Fabric layer supports fabric-level processing: frame formatting, routing, error recovery, port linking/mapping, link bundling, and flow control. The Link layer handles error coding/detection, frame CRC, and control messaging. The lowest layer, the physical layer (PHY), handles the electrical signaling, clocking, encoding (8b/10b), and PHY error detection. Featuring portability, the protocol can run on different media.
The switch fabric supports three routing methods: Address (PCI), Source, and Multicast. The first, Address Routing, uses PCI addresses to route variable-length frames through the fabric (Fig. 4). No additional host software is needed, and the fabric behaves like a collection of PCI bridges.
Source (or Path) Routing builds on the flexibility of the switch fabric (Fig. 5). The routing information is stored in the end nodes and attached to each frame's header as the frame passes into the fabric. Determined on enumeration, the paths are designed for maximum flexibility.
The third method, Multicast Routing, enables the transmission of a single frame to multiple destinations. It allows an end node to connect to multiple end nodes for a system-wide or partition-wide transmission of a frame. Multicast Routing only applies to Writes; there's no Multicast Read. Multicast frames are routed to nodes that belong to a particular Multicast group, which is identified by a Multicast Group ID field.
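One way to picture Multicast Routing at a single switch node is sketched below. The article does not describe how a switch stores group membership, so the per-switch table keyed by Multicast Group ID, the function name, and the rule of never replicating back out of the input port are all illustrative assumptions.

```python
# Hypothetical per-switch table: Multicast Group ID -> set of member output ports.
GROUP_TABLE = {7: {0, 2, 5}, 9: {1}}


def multicast_output_ports(group_table, group_id, input_port, is_write=True):
    """Return the ports a multicast frame would be replicated to (sketch only)."""
    if not is_write:
        # The article states that Multicast applies to Writes only.
        raise ValueError("Multicast Routing applies to Writes only")
    members = group_table.get(group_id, set())
    # Assumption: the frame is not sent back out of the port it arrived on.
    return sorted(members - {input_port})


ports = multicast_output_ports(GROUP_TABLE, group_id=7, input_port=2)
print(ports)
```

A frame for group 7 arriving on port 2 would be copied to the group's other member ports in this model; an unknown group ID yields no output ports.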
Inside the fabric, Line Credits manage flow control at the nodes. Instead of a node transmitting blindly and waiting for an acknowledge or failure, the destination node controls the frame transmission. It issues line credits to the transmitting node for each CoS. If a transmitting node has enough line credits to send a frame to the receiving node, it does so. If not, it waits until it has accumulated enough credits, which the destination node sends as its buffers free up.
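The credit mechanism can be sketched from the transmitter's side as follows. This is a minimal model, not StarFabric's actual logic: the class and method names are hypothetical, and it assumes one credit corresponds to one 16-byte line of receiver buffer space.

```python
class CreditedLink:
    """Transmitter-side view of one link, with a separate credit pool per CoS."""

    def __init__(self, initial_credits):
        # initial_credits: CoS name -> credits advertised by the receiving node.
        self.credits = dict(initial_credits)

    def try_send(self, cos, frame_lines):
        """Send only if enough credits are held; otherwise the frame waits."""
        if self.credits.get(cos, 0) < frame_lines:
            return False
        self.credits[cos] -= frame_lines
        return True

    def receive_credits(self, cos, lines):
        """The receiving node returned credits after freeing buffer space."""
        self.credits[cos] = self.credits.get(cos, 0) + lines


link = CreditedLink({"asynchronous": 8, "isochronous": 4})
first = link.try_send("asynchronous", 6)    # enough credits: sent
second = link.try_send("asynchronous", 4)   # only 2 credits left: held back
link.receive_credits("asynchronous", 6)     # receiver frees six lines
third = link.try_send("asynchronous", 4)    # now succeeds
print(first, second, third, link.credits)
```

Keeping a separate pool per CoS means a stalled class cannot consume the buffer space that another class is counting on.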
Additionally, bandwidth can be reserved on the node-to-node links for both Multicast and isochronous traffic. This reservation system follows the StarFabric principle of securing link resources before transmitting, which eliminates transmissions blocked by a lack of receiver resources (buffers).
PCI Address Routing
The fabric operates as a collection of PCI devices and bridges for PCI Address Routing. Each node uses the frame's Address field to determine the next node on the route. It decodes the address against a set of address ranges and control bits. The nodes support 32- or 64-bit addressing. The switch node implements the standard set of PCI-to-PCI configuration registers and a Port Map, which maps each link to a PCI address range for next-node routing. During enumeration, the registers and Port Map are set up. StarFabric defines Channels, which are basically PCI address space segments. The PCI address range is divided into 256 segments, and segment 255 is used to set PCI registers in the nodes.
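The Port Map lookup can be illustrated with a small address-range decode. The data layout, names, and address values below are hypothetical, and the sketch ignores the control bits and Channel segments mentioned above.

```python
from typing import NamedTuple, Optional


class PortRange(NamedTuple):
    port: int    # output link on the switch node
    base: int    # first address routed to that link
    limit: int   # last address routed to that link (inclusive)


# Hypothetical Port Map contents, as they might look after enumeration.
PORT_MAP = [
    PortRange(port=1, base=0x8000_0000, limit=0x8FFF_FFFF),
    PortRange(port=3, base=0x9000_0000, limit=0x9FFF_FFFF),
]


def next_port(port_map, address) -> Optional[int]:
    """Return the link whose address range contains the frame's Address field."""
    for entry in port_map:
        if entry.base <= address <= entry.limit:
            return entry.port
    return None  # no range matched; this sketch does not model what happens next


print(next_port(PORT_MAP, 0x8000_1000), next_port(PORT_MAP, 0x9FFF_FFFF))
```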
The switch nodes also implement Smart Address Routing, which supports routing through intermediate switch nodes outside the PCI address range. It permits routing through nodes that aren't part of the PCI hierarchy but make for a shorter path.
On initialization, the fabric performs an enumeration. End nodes are either leaf or root nodes. There's only one root node (actually two for redundancy; they are identified by strapped pins). The root node starts an enumeration to the nodes that it connects to, and those nodes spread the enumeration through the fabric. The root also assigns Fabric IDs (FIDs) to the nodes. It assumes the first FID, then assigns a FID to each connecting node via a special "You Are" frame. Each of those nodes returns a special "I Am" frame to the root, accepting the FID. These nodes also send "You Are" frames to the nodes they connect to, assigning FIDs. Those nodes do the same until all nodes, both switch and end, are assigned FIDs.
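The spread of enumeration outward from the root resembles a breadth-first traversal, sketched below. This is a simplified model: the topology, node names, and the use of sequential integers as FIDs are assumptions for illustration, and the "You Are"/"I Am" exchange is reduced to log entries.

```python
from collections import deque


def enumerate_fabric(links, root):
    """Assign an ID to every node reachable from the root, breadth first."""
    fids = {root: 0}          # the root assumes the first FID
    log = []                  # (frame type, sender, receiver, fid)
    queue = deque([root])
    next_fid = 1
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor in fids:
                continue      # already enumerated through another link
            fids[neighbor] = next_fid
            log.append(("You Are", node, neighbor, next_fid))
            log.append(("I Am", neighbor, node, next_fid))
            next_fid += 1
            queue.append(neighbor)
    return fids, log


# Hypothetical fabric: one root end node, two switch nodes, two leaf end nodes.
LINKS = {
    "root": ["sw0"],
    "sw0": ["root", "sw1", "leafA"],
    "sw1": ["sw0", "leafB"],
    "leafA": ["sw0"],
    "leafB": ["sw1"],
}

fids, log = enumerate_fabric(LINKS, "root")
print(fids)
```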
FIDs consist of a parallel fabric number and a path specification. The path specification is a very clever way of defining the routing from the root node to the FID node. It consists of eight 3-bit fields: a turn count, and seven turns. A turn is simply a port count in a switch from the input port to the connecting output port, as in "two turns or ports to the left." Therefore, each node stores an addressing pointer that can be used to route a frame from that node to its root node.
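A path specification of eight 3-bit fields fits in 24 bits, which the following sketch packs and unpacks. The field order (turn count in the lowest three bits) and the rule for turning a "turn" into an output port are assumptions made for illustration; the article does not give the bit layout.

```python
FIELD_BITS = 3          # each field is 3 bits wide
MAX_TURNS = 7           # one turn-count field plus seven turn fields
PORTS_PER_SWITCH = 6    # six-port switch node


def pack_path(turns):
    """Pack a list of turns into a 24-bit path specification (assumed layout)."""
    if len(turns) > MAX_TURNS:
        raise ValueError("a path holds at most seven turns")
    if any(not 0 <= t < (1 << FIELD_BITS) for t in turns):
        raise ValueError("each turn must fit in 3 bits")
    value = len(turns)  # turn count occupies the lowest field
    for i, turn in enumerate(turns, start=1):
        value |= turn << (FIELD_BITS * i)
    return value


def unpack_path(value):
    """Recover the list of turns from a packed path specification."""
    count = value & 0b111
    return [(value >> (FIELD_BITS * i)) & 0b111 for i in range(1, count + 1)]


def output_port(input_port, turn):
    """Assumed rule: count 'turn' ports around the switch from the input port."""
    return (input_port + turn) % PORTS_PER_SWITCH


path = pack_path([2, 1, 4])
print(hex(path), unpack_path(path), output_port(5, 2))
```

Because the route is expressed relative to each switch's input port, a switch needs no global routing table to forward a path-routed frame.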
Also during enumeration, the nodes set up the line credits defining their node connections. For example, a destination node will send a special frame to its source nodes delimiting the available line credits (internal frame buffer space) for each CoS class on interconnection. Using line credits, the transmitting nodes will only send frames when there's room to receive them at the destination node. Thus, the fabric itself can't saturate, and the end nodes won't initiate frame transactions unless they can be received.
Frames are the basic StarFabric unit of transmission. They carry Read/Write data, as well as control and error data. Variable in length, frames are built up from multiple 16-byte lines, to a maximum of 128 bytes (eight lines). Frames also incorporate a Frame Header, and they start and end with a Link Overhead, which is both a prefix and a suffix.
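The line arithmetic can be made concrete with a short helper. It is a sketch that counts only the data carried in a frame, on the assumption that the 128-byte maximum refers to that data and not to the Frame Header or Link Overhead; the function name is hypothetical.

```python
LINE_BYTES = 16   # frames are built from 16-byte lines
MAX_LINES = 8     # eight lines give the 128-byte maximum


def frame_lines(payload_bytes):
    """Return (lines needed, trailing unused bytes in the last line)."""
    if not 0 < payload_bytes <= LINE_BYTES * MAX_LINES:
        raise ValueError("payload must be 1 to 128 bytes")
    lines = -(-payload_bytes // LINE_BYTES)        # ceiling division
    unused = lines * LINE_BYTES - payload_bytes    # padding in the final line
    return lines, unused


print(frame_lines(100), frame_lines(128), frame_lines(1))
```

A 100-byte payload needs seven lines and leaves 12 bytes of the last line unused, which is the kind of trailing padding a frame header has to describe.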
The Link Overhead prefix consists of a Header line, a data byte line, and a line that incorporates a 7-bit Frame Sequence Number, a Line Debit Type flag, a 16-bit CRC code, and a Line Credit field. The Frame Sequence Number is assigned on transmission from a node and may change from node to node. The Line Debit Type identifies the CoS or Path method, and the CRC check includes the Frame Sequence Number, so it may change on every node transmission.
The Frame Header fields include:
- Frame Size (number of 16-byte lines)
- CoS (defines 1-of-8 CoS)
- Path (frame route)
- Transaction Number (6 bits, assigns number to end-node transaction)
- Request Transaction Number (Transaction Number from the initiating node, included by the responding node, as in a Write acknowledge or Read Completion frame)
- Orphan Byte Enable & Count (defines trailing unused bytes at end of frame)
- Offset (42-bit Dword-aligned relative offset for Path and Multicast)
Plus, for Address Routing (PCI), the Frame Header includes fields for Address, Channel Number, and Target Region.
The fabric supports reports of path, chip, or signal events. When an error occurs, or some notification is needed, the switch nodes generate path and chip events. Path events include node notification when a path-routed frame encounters a down port, a nonexistent output port, or a path not ending at an end node. Path events are directed to the edge node that initiated the frame. Chip events may announce an error condition or provide notification of an informational event.
The edge nodes generate signal events whenever one of them detects the assertion or deassertion of an interrupt signal pin and the event needs to be propagated to another edge node in the fabric. Supported signal events include INTA#, INTB#, INTC#, INTD#, PME#, and ENUM# asserts and deasserts, and SERR# asserts. These signals handle PCI compatibility. An event frame is typically sent for every signal assertion and deassertion.
StarFabric And CompactPCI
Unlike InfiniBand, which requires a software and hardware infrastructure, StarFabric is now deployable on CompactPCI or PCI-based systems. Designers can use StarFabric as an underlying adjunct busing system, interfacing their existing subsystem components to StarFabric through the Star PCI-Fabric Bridge chip, the SG 2010. This chip bridges to a 32/64-bit, 33/66-MHz PCI bus, and to two StarFabric 5-Gbit/s ports.
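The 5-Gbit/s port figure is consistent with the 622-Mbit/s LVDS signaling rate if each port carries four pairs in each direction and the figure counts both directions. That reading is an assumption, as is applying the 8b/10b encoding overhead to estimate the usable data rate. A quick check of the arithmetic:

```python
LVDS_PAIR_MBITS = 622      # line signaling rate per LVDS pair
PAIRS_PER_DIRECTION = 4    # assumed: four pairs transmit, four receive

per_direction = LVDS_PAIR_MBITS * PAIRS_PER_DIRECTION   # raw Mbit/s one way
full_duplex = 2 * per_direction                         # raw Mbit/s both ways
# 8b/10b encoding sends 10 line bits for every 8 data bits.
payload_per_direction = per_direction * 8 / 10

print(per_direction, full_duplex, payload_per_direction)
```

Under these assumptions a port moves roughly 2.5 Gbits/s of raw line traffic in each direction, or about 5 Gbits/s in total, with around 2 Gbits/s of data per direction after encoding.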
Additionally, engineers can use StarFabric as a hybrid backplane. With StarFabric, they can overcome the bandwidth and seven-board/segment limitations of CompactPCI. Bustronic is working on a 21-slot hybrid backplane. The backplane holds 21 CompactPCI cards, organized into five segments with two standalone native node slots. The CompactPCI boards are interlinked with StarFabric connections and can support an underlying StarFabric.