Designers need high-availability systems for applications that can't stop when components fail. To meet that need, researchers have combined a new family of CompactPCI cards with extensions to the CompactPCI bus interface that support multiple CPUs, along with a system-level approach to failover redundancy. The resulting CP2000 High-Availability (HA) system architecture includes the Ultra AXe series of cost-conscious boards, developed by Sun Microsystems Inc., Palo Alto, Calif. These cards were designed to handle multiple levels of hot-swap and hot-plug capability.
Along with these cards, Sun's developers crafted a suite of system-management tools. These tools leverage the Intelligent Platform Management Interface (IPMI) bus, which is used as an out-of-band communications channel between sensors and control/monitoring circuits. Sun's engineers also enhanced the integrated Solaris 8.0 operating environment with extensions to handle failover management. They also plan to extend the real-time kernel of ChorusOS.
The HA architecture will be released in three phases. Phase I will provide the full hot-swap capabilities and a clean, consistent application programming interface (API). It also will include the definition of the IPMI bus and its API. The Sun Cluster software will permit clustering between shelves for "shared-data" configurations as well. In Phase II, the HA architecture will be integrated with satellite cards, and the system controller will be able to run Solaris or ChorusOS for them. Finally, the dual-host or alternate system controller capabilities will be available in Phase III, providing N+1 availability from the line cards or satellite boards.
The Sun Cluster 2.2 software has been extended to the just-introduced CP1500 motherboard and the forthcoming CP2020 CompactPCI CPU board, which is based on the 300-MHz UltraSPARC IIi processor. Available as part of the CP2000 HA program, the software provides deployment and automated management of redundant computing nodes. In the event of a hardware or software failure, one node can take over for another with minimal interruption. By using standards-based hardware, Sun's designers can greatly reduce the system development cycle at the user company. For example, a telecom carrier would typically employ the system hardware to implement its main public or private branch exchange (PBX) switch.
Using the failover technique, the cluster software enables the systems to achieve high availability by monitoring both hardware and application software. If either one fails, the software automates recovery by restarting the affected applications on a healthy node.
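The monitor-and-restart behavior can be illustrated with a minimal sketch. It is not Sun Cluster code: the `Node` class, the `fail_over` function, and the rule of picking the least-loaded healthy node are all assumptions made for illustration, and "restarting" an application is reduced to moving its name from one list to another.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical computing node; not a Sun Cluster data structure."""
    name: str
    healthy: bool = True
    apps: list = field(default_factory=list)


def fail_over(nodes):
    """Move applications off unhealthy nodes onto the least-loaded healthy node.

    Returns a list of (app, from_node, to_node) tuples, one per restart.
    The least-loaded placement rule is an assumption for illustration.
    """
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        raise RuntimeError("no healthy node available for failover")
    moves = []
    for node in nodes:
        if node.healthy:
            continue
        for app in list(node.apps):
            target = min(healthy, key=lambda n: len(n.apps))
            node.apps.remove(app)
            target.apps.append(app)  # stands in for restarting the application
            moves.append((app, node.name, target.name))
    return moves
```

In a two-node configuration, marking the node that runs the switching application as unhealthy and calling `fail_over` leaves that application on the surviving node.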
The HA architecture takes advantage of additional features designed into the CompactPCI cards in the CP2000 series, though the first such card, the CP2020, won't be ready for sampling until the second half of this year. The CP2000 HA program, meanwhile, includes comprehensive hot-swap support. As part of the device driver, this support defines what the designer must do when the enumerate (ENUM#) line begins toggling. Previously, developers were on their own in figuring out what to do, and that left open the possibility of incompatibilities between cards.
By building on the idea of a separate IPMI control bus, the HA architecture lets a second system controller reside on the PCI backplane. This circumvents the problem of having two PCI bus masters, since the two system controllers can use the IPMI bus to agree on which of them is mastering the bus. In fact, this activity is so important that the HA architecture adds a second IPMI bus to perform this operation exclusively.
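The article does not describe how the two controllers reach agreement, so the following is only a sketch of one deterministic rule both sides could apply to the same exchanged status. The `Controller` class, the `elect_master` function, and the rule itself (a healthy incumbent keeps the bus, otherwise the healthy controller in the lowest slot wins) are assumptions, not part of the CP2000 design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    """Hypothetical status record each controller would report out of band."""
    slot: int
    healthy: bool
    is_master: bool = False


def elect_master(a, b):
    """Return the controller that should master the PCI bus, or None.

    Assumed rule: a healthy incumbent keeps mastership; otherwise the
    healthy controller in the lowest-numbered slot takes over.
    """
    candidates = [c for c in (a, b) if c.healthy]
    if not candidates:
        return None
    incumbents = [c for c in candidates if c.is_master]
    pool = incumbents if incumbents else candidates
    return min(pool, key=lambda c: c.slot)
```

Because both controllers evaluate the same rule on the same inputs, they arrive at the same answer without contending on the PCI bus itself.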
The Cluster software can leverage the combined capabilities of the hot-swap and IPMI communication/control channel standards. The result is a system management bus that provides an independent management structure to manage system configurations.
In a traditional CompactPCI system, the CompactPCI backplane defines the computer node, and it typically has from 1 to 16 slots. Usually, the CPU is located in slot 1 in such a system, while a system controller card is in slot 8. The remaining satellite slots generally hold the various I/O cards as well as a PCI-to-PCI bridge card to overcome the 7-slot drive limit. Yet the backplane and all the cards typically function as one computer node.
As part of Sun's CP2000 HA program, enhancements provide a standard method of communication between satellite CPU cards and the system controller. This allows the CP2000 CompactPCI cards to be installed in either system controller or satellite slots. That flexibility makes it possible to construct distributed or loosely coupled systems. Another enhancement improved the resilience of the CompactPCI cabinet (shelf) so that it could recover from single card failures. Designers at Sun added comprehensive driver support for hot swap that brings a clean API to the hot-swap features. This provides a clear interface in the device driver, easing the creation of software that can recover from card replacements.
Key to this will be its hot-plug capability (see the figure). When the replacement board is plugged in, the user requests through the various layers of software that the hot-plug service restore power to the slot containing the new board. Then, the hot-plug services software issues a primitive command to the hot-plug driver to turn on the appropriate slot. Once the slot is powered up and the board is reconnected to the bus, the hot-plug service notifies the operating system (OS) that the new board is installed. That way, the OS can initialize the board and then indicate that the board is ready for use.
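The sequence just described can be modeled as a small state machine. This is a sketch only: the `SlotState` and `HotPlugSlot` names are hypothetical and do not correspond to the Solaris hot-plug driver interface. It shows only that each step is valid in exactly one prior state.

```python
from enum import Enum


class SlotState(Enum):
    EMPTY = "empty"
    INSERTED = "inserted"  # board seated, slot still unpowered
    POWERED = "powered"    # slot powered, board reconnected to the bus
    READY = "ready"        # OS has initialized the board


class HotPlugSlot:
    """Hypothetical model of one backplane slot during board replacement."""

    def __init__(self):
        self.state = SlotState.EMPTY
        self.events = []

    def _advance(self, expected, new_state, event):
        # Reject any step that arrives out of order.
        if self.state is not expected:
            raise RuntimeError(f"'{event}' is not valid in state {self.state.value}")
        self.state = new_state
        self.events.append(event)

    def insert_board(self):
        self._advance(SlotState.EMPTY, SlotState.INSERTED, "board inserted")

    def power_on(self):
        self._advance(SlotState.INSERTED, SlotState.POWERED, "slot powered")

    def os_initialize(self):
        self._advance(SlotState.POWERED, SlotState.READY, "board ready")
```

Calling `insert_board()`, `power_on()`, and `os_initialize()` in that order leaves the slot in the `READY` state, while asking for power on an empty slot raises an error.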
The IPMI bus, used for I/O card support, offers an out-of-band communication link between the system controller and the satellite card. That gives the controller a second channel for communicating with the card should the main bus get jammed or damaged. Furthermore, changes to the CPU cards now allow dual-host or alternate system controllers to reside on the same bus, removing a crucial single point of failure in previous systems: the system controller itself. Finally, application-specific or card-specific failover techniques were added to the hardware.
The shelf-clustering capability is yet another enhancement. It allows the systems to support scaling and further improve reliability through a number of additional techniques. One is a focus on shelves (racks) rather than cards, which makes the clustering easier to manage. Another is the implementation of a checkpoint, failover, and restart mechanism for applications running within a shelf. As part of the redundancy support, the system provides alternate pathing. Disk and network operations, then, can be automatically redirected to predefined alternate paths should a failure occur. The failed card then can be serviced without disrupting the system. Additionally, the system can be dynamically reconfigured while it is up and running, without a reboot. That also aids the system manager when online repairs or server reconfiguration must be completed.
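Alternate pathing amounts to walking a predefined, ordered list of paths and using the first one that still works. The sketch below assumes exactly that; the `select_path` function, the `is_up` callback, and the path names in the usage note are hypothetical and are not the Solaris alternate-pathing interface.

```python
def select_path(paths, is_up):
    """Return the first path, in predefined order, that is_up() reports usable.

    paths: ordered sequence, primary first, then the predefined alternates.
    is_up: callable taking a path and returning True if it is usable.
    """
    for path in paths:
        if is_up(path):
            return path
    raise RuntimeError("no usable path to the device")
```

For example, with paths `["ctrl0", "ctrl1"]` and a failed `ctrl0`, the call returns `"ctrl1"`, so I/O continues while the failed card is serviced.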
Along with the hardware improvements, software at the operating system level (Solaris 8.0 and ChorusOS 4.0) was enhanced to better support the clustering and failover. The CP2000 HA program combines the real-time ChorusOS with the Solaris OS to yield a 64-bit operating system with a hot-swap framework that integrates CompactPCI-specific drivers and optimized backplane communications. The extended Solaris OS will be an integrated program that contains common APIs, management functions, and common Java technology-enabled capabilities that permit the dynamic delivery of IP services. The common set of open APIs and services is shared between the two operating systems.