Bloomberg Businessweek’s article “The Big Hack: How China Used a Tiny Chip to Infiltrate U.S. Companies” alleges that the supply chain was corrupted in China by adding a chip to the motherboard (see figure). Such a chip would likely replace or intercept communication between the baseboard management controller (BMC) and the serial flash-memory chip that holds the BMC’s code. One of the most common BMC chip families comes from Aspeed, a Taiwan-based vendor. A number of chips populate this family, and they’re used on the motherboards in question.
The BMC is a typical Arm-based system-on-chip (SoC). It has on-board peripherals and some memory, but it can utilize off-chip memory as well. Often the boot code is contained in a serial flash device or a parallel NAND flash device.
What’s the Best Method to Fix the Problem?
In theory, it would be easy to replace the application in flash memory with compromised code, but this approach has a problem. The flash memory is often reprogrammed by the vendor when it receives the motherboard so that the hardware can be delivered with the latest code. This is often done using a connection directly to the memory rather than by booting the system and programming in new software, although that’s a possible method as well. However, the latter method could be circumvented by a compromised system.
Another approach would be to replace the memory chip with a custom version that included compromised code. The challenge with this approach is twofold. First, creating such a chip is a major undertaking. Second, the size and capacity of the chip would be limited by the chip it replaced. This may be less of an issue given the size reductions available using new technology.
The baseboard management controller (BMC) is an SoC that can boot from off-chip flash or serial memory. Interposing a chip between the serial memory and the BMC would be one way to take control of the system.
The third approach, and evidently the one chosen, is to add a custom chip between the serial memory and the BMC. This has the advantage of not changing the flash memory, but it does mean creating a new chip which, as noted, isn’t an easy task. It could be a relatively simple chip, and it’s even possible to use an existing microcontroller.
The challenge is that the traces on the motherboard for the serial memory would have to be altered. According to Bloomberg, contractors building the boards and populating them were coerced to make these additions. The small size of the chip and minimal alterations would prevent a casual observer from noting a change. Hiding the chip near or under other hardware makes that task even more difficult.
The advantage of this approach is that it would work with a wider range of motherboards. The attacker’s chip could be programmed to handle different BMC chips. Also, the amount of code that the attacker needs to include on their chip doesn’t have to be large, since it can take advantage of the code that the BMC chip would load from the flash memory on boot. Attackers simply need to modify the code to suit their needs. The existing code is likely to include a small RTOS, communication support, etc., that could be exploited. Such an approach isn’t easy to implement, but it is very practical.
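A toy model can make the interposer idea concrete. The sketch below is purely illustrative and is not based on any details of the alleged hardware: the class names, the 64-byte "firmware," and the patched address are all hypothetical. It shows that the flash contents can check out as unmodified when read directly, while the data the BMC receives through the interposer differs.

```python
import hashlib


class SerialFlash:
    """Hypothetical stand-in for the serial flash chip holding the BMC code."""

    def __init__(self, image):
        self._image = bytes(image)

    def read(self, addr, length):
        return self._image[addr:addr + length]


class Interposer:
    """Hypothetical chip sitting between the BMC and the flash.

    It forwards reads to the flash and rewrites selected bytes in transit.
    """

    def __init__(self, flash, patches):
        self._flash = flash
        self._patches = patches  # maps address -> replacement byte value

    def read(self, addr, length):
        data = bytearray(self._flash.read(addr, length))
        for patch_addr, value in self._patches.items():
            if addr <= patch_addr < addr + len(data):
                data[patch_addr - addr] = value
        return bytes(data)


def digest(device, size):
    """SHA-256 of the first `size` bytes as seen through `device`."""
    return hashlib.sha256(device.read(0, size)).hexdigest()


firmware = bytes(range(64))               # illustrative image, not real firmware
flash = SerialFlash(firmware)
bus = Interposer(flash, {0x10: 0xFF})     # one illustrative patched byte

direct = digest(flash, len(firmware))     # view of a programmer attached to the flash
via_bus = digest(bus, len(firmware))      # view of the BMC through the interposer
print("flash unchanged:", direct == hashlib.sha256(firmware).hexdigest())
print("BMC sees same image:", via_bus == direct)
```

The point of the model is that a check made directly against the memory chip would not reveal anything, because the modification happens on the path between the memory and the BMC.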
More details have been revealed lately, and it appears that the motherboard’s circuit board did not have to be modified. Likewise, the additional chip may simply be a standard serial memory chip that was added to a location designed for the chip and left unpopulated. This is a common design approach to provide more options. For example, a TPM security chip is often an option for a server motherboard. The chip is simply left out if the motherboard will not provide that option.
Leaving out a single chip is common, but so is leaving entire sections of a printed circuit board (PCB) unpopulated. It would be impossible for someone without a circuit diagram and bill of materials to determine what should or should not be on a PCB.
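With a bill of materials in hand, the comparison itself is simple set arithmetic. The following is a minimal sketch, assuming the inspector has three lists of reference designators: the parts the build requires, the footprints that are designed in but meant to stay empty for this build, and the parts actually observed on the board. The function name and the designators are hypothetical.

```python
def audit_population(required, optional_unpopulated, observed):
    """Compare observed parts on a board against its bill of materials.

    required: designators that must be populated for this build
    optional_unpopulated: footprints that exist but should be empty
    observed: designators actually found populated on the board
    """
    required = set(required)
    optional_unpopulated = set(optional_unpopulated)
    observed = set(observed)
    return {
        # Parts sitting on footprints that should have been left empty.
        "optional_populated": observed & optional_unpopulated,
        # Parts with no matching footprint in the design at all.
        "unexpected": observed - required - optional_unpopulated,
        # Parts the design calls for that are absent.
        "missing": required - observed,
    }


report = audit_population(
    required={"U1", "U2", "U3"},
    optional_unpopulated={"U7"},        # e.g., an optional memory footprint
    observed={"U1", "U2", "U3", "U7"},
)
print(report)
```

Here the "optional_populated" entry corresponds to the case described above, where a chip appears at a location the design provides but the build leaves empty.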
The hack was supposedly caught not by observing the changes to the motherboard, but by noticing abnormal network traffic. A more sophisticated implementation might delay compromised communication until much later, making it much harder to detect.
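One simple form of traffic monitoring is to flag any connection from a BMC address to a destination outside the networks it is expected to talk to. The sketch below assumes flow records are available as (source, destination) address pairs; the function name, the addresses, and the allowed network are all made up for illustration, and a delayed or low-volume channel could still evade a check this basic.

```python
import ipaddress


def unexpected_flows(flows, bmc_addrs, allowed_nets):
    """Return flows that originate at a BMC and leave the allowed networks.

    flows: iterable of (src, dst) address strings
    bmc_addrs: addresses assigned to BMC management ports
    allowed_nets: networks the BMCs are expected to communicate with
    """
    allowed = [ipaddress.ip_network(net) for net in allowed_nets]
    bmcs = {ipaddress.ip_address(addr) for addr in bmc_addrs}
    flagged = []
    for src, dst in flows:
        src_ip = ipaddress.ip_address(src)
        dst_ip = ipaddress.ip_address(dst)
        if src_ip in bmcs and not any(dst_ip in net for net in allowed):
            flagged.append((src, dst))
    return flagged


flows = [
    ("10.0.0.5", "10.0.0.1"),       # BMC to admin console: expected
    ("10.0.0.5", "203.0.113.9"),    # BMC to an outside address: flagged
    ("10.0.1.7", "203.0.113.9"),    # not a BMC address: ignored here
]
suspicious = unexpected_flows(flows, ["10.0.0.5"], ["10.0.0.0/24"])
print(suspicious)
```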
BMC at the Center of It All
The BMC is normally tied into one or two Ethernet ports. One is usually designed to be connected to a dedicated administration network that’s often isolated so that traffic can’t get on the internet. Network managers are able to use this network to manage a server farm; the operating systems and applications that run on the server can’t even detect that this is being done. A BMC typically provides simulated serial ports and disk drives that are indistinguishable from the real thing. It can also control and modify the boot memory for the main processor.
A second BMC port is often piggybacked onto a network port used by the main processor. It would be connected to a more public network or the internet. This “feature” allows the public network to be used for administration, simplifying the operation of small networks by not requiring a parallel network for each server. Encrypted communication can prevent attacks even in this case, but it’s not the same thing as an isolated network. Of course, the compromised BMC could also gain control of the main processor, though that’s a much harder programming chore. Compromising the BMC is much easier.
The BMC has control of both ports; therefore, a compromised system could use the public connection even while the BMC application was using the administration network in the normal fashion. The BMC would need its own IP address on the public network in addition to the one used by the main processor, so an unexpected address could be a way of detecting a problem, assuming the compromised system operates in this way.
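Checking for an extra address can be sketched as a comparison between an inventory and what is actually seen on each switch port. The example below assumes an inventory that maps a switch port to the addresses expected on it, plus a list of (port, address) observations gathered by whatever means the network provides; the function name, port names, and addresses are hypothetical.

```python
def unexpected_addresses(inventory, observed):
    """Return addresses seen on a port that the inventory does not list.

    inventory: dict mapping switch port -> set of expected IP addresses
    observed: iterable of (switch port, IP address) pairs actually seen
    """
    extras = {}
    for port, ip in observed:
        if ip not in inventory.get(port, set()):
            extras.setdefault(port, set()).add(ip)
    return extras


inventory = {"sw1/12": {"198.51.100.20"}}    # one address expected per server port
observed = [
    ("sw1/12", "198.51.100.20"),             # the main processor's address
    ("sw1/12", "198.51.100.77"),             # a second, unlisted address
]
extras = unexpected_addresses(inventory, observed)
print(extras)
```

A site that deliberately shares the port with the BMC would list both addresses in the inventory, so only addresses nobody assigned would be reported.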
Some SoCs are designed to address this type of attack, but most are not. Essentially, each SoC would need to have its own crypto identification that’s used to verify or decrypt boot code. This would prevent booting of a compromised system. It typically requires a signed version of code that’s unique to an SoC.
The advantage of this approach is that only the SoC creation process needs to be secured, which tends to be much easier to manage than the assembly of boards, chips, and other hardware to create a server or other device. Unfortunately, this approach is relatively new, not available to most OEMs, and not yet used by most vendors.
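The idea of boot code tied to a specific SoC can be sketched in a few lines. This is a simplified illustration, not how any particular SoC implements it: it uses a keyed hash (HMAC-SHA256) from the Python standard library as a stand-in, whereas production designs generally rely on asymmetric signatures with key material held in the chip. The function names, keys, and image bytes are hypothetical.

```python
import hashlib
import hmac


def sign_image(device_key, image):
    """Produce an authentication tag for `image` bound to one device's key."""
    return hmac.new(device_key, image, hashlib.sha256).digest()


def verify_before_boot(device_key, image, tag):
    """Return True only if `image` matches the tag for this device's key."""
    expected = hmac.new(device_key, image, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


key_soc_a = b"illustrative-key-for-soc-a"     # assumed to live inside SoC A
key_soc_b = b"illustrative-key-for-soc-b"     # a different SoC's key
image = b"bmc boot code (placeholder bytes)"
tag = sign_image(key_soc_a, image)

ok = verify_before_boot(key_soc_a, image, tag)                    # genuine image
tampered = verify_before_boot(key_soc_a, image + b"\x00", tag)    # altered in transit
wrong_soc = verify_before_boot(key_soc_b, image, tag)             # image for another SoC
print(ok, tampered, wrong_soc)
```

Because the check runs inside the SoC on the bytes it actually receives, code altered between the memory and the SoC would fail verification, as would an image prepared for a different device.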
Antivirus software, even something implemented in the processor’s boot code, is incapable of detecting or preventing an attack like this from compromising the system.
If the attack winds up being real, then it could force many companies to reevaluate all aspects of their supply chain. It might be a good idea to do that anyway, because even if this instance was a hoax, the next one may not be. The method of attack is valid, although it’s difficult to implement unless one has influence over part of the supply chain. Still, there are many ways to do this.
System security is getting much better in general with features like secure boot, if they’re used. Attackers will likely continue to exploit holes or attack targets such as the BMC that bypass normal security measures. It simply means that security needs to be applied to all aspects of a system: its design, its deployment, and even its manufacture.