Heterogeneous Cores Turn Off Transistors To Save Power

Is turning off transistors the problem or the solution? A recent article in IEEE Computer raised the specter of “dark silicon,” stating that modern processors can’t run all their transistors at the same time without overheating. As a result, transistors are being wasted. I agree that designers will be turning off even more transistors in future processors. But these processors will do so intentionally, as part of a trend toward specialized cores.

Dark Silicon Rising

The concept of dark silicon comes from a recent academic study estimating that 21% of the transistors on a next-generation chip will be turned off at any given time to avoid exceeding the chip’s power limits. According to the study, this portion could rise beyond 50% in just a few generations, ultimately limiting processor performance.

Dark silicon evokes memories of dark fiber, those fiber-optic cables constructed during the 1999 boom that lay unused for years because data traffic did not match the original optimistic predictions. Processors are not like fiber-optic networks, however. For one thing, transistors cost a heck of a lot less than cables.

Moore’s Law enables a doubling of the number of transistors in each generation. As a result, the cost per transistor is cut roughly in half every two years. In today’s 28-nm technology, transistors have become so cheap they’re nearly free. Economically, it now makes sense to put a number of functions on a chip even if customers rarely use many of those functions.
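To put rough numbers on that halving, here is a minimal sketch. It assumes an idealized 50% cost reduction per generation and a two-year cadence, as described above; the function name and the starting cost of 1.0 are illustrative only, and real process economics are messier.

```python
def relative_cost(generations: int, scaling_per_generation: float = 0.5) -> float:
    """Cost per transistor relative to generation 0.

    Assumes (idealized) that the cost is multiplied by the same factor,
    0.5 by default, in every process generation.
    """
    return scaling_per_generation ** generations


if __name__ == "__main__":
    YEARS_PER_GENERATION = 2  # assumed cadence, per the text above
    for gen in range(6):
        year = gen * YEARS_PER_GENERATION
        print(f"generation {gen} (about year {year}): "
              f"{relative_cost(gen):.5f}x the original cost")
```

Under these assumptions, five generations (about a decade) leave a transistor costing roughly 3% of what it did at the start, which is why a rarely used function block adds so little to the cost of a chip.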

Worrying about transistors that are not being powered is like worrying about the fact that your sofa is probably not being used right now. The issue isn’t whether transistors are dark. For years, processor designers have turned off parts of the chip that are not in use at any given time. For example, many PC users rarely, if ever, use the transistors in their floating-point unit (FPU), but this is not a concern to them.

Power Limits Require Changes

There is no doubt that we are running into limits on the amount of power that a chip can burn. Different systems (smart phone, laptop, server) have different power limits, but everything today is running at its limit. Therefore, the only way to increase performance is to improve performance per watt, also known as power efficiency.
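The reasoning here is simple arithmetic, sketched below. It assumes the simplest possible model, in which delivered performance is the power budget multiplied by performance per watt; the function names and units are hypothetical and chosen only for illustration.

```python
def performance(power_budget_watts: float, perf_per_watt: float) -> float:
    """Performance delivered when a chip runs at its power limit.

    Simplified model (an assumption): performance = watts * (performance per watt).
    """
    return power_budget_watts * perf_per_watt


def speedup_at_fixed_power(old_perf_per_watt: float,
                           new_perf_per_watt: float,
                           power_budget_watts: float = 1.0) -> float:
    """Speedup obtained from better efficiency while the power budget stays fixed."""
    old = performance(power_budget_watts, old_perf_per_watt)
    new = performance(power_budget_watts, new_perf_per_watt)
    return new / old
```

In this model the power budget cancels out of the ratio, so a system pinned at its limit speeds up only by as much as its efficiency improves, whether that limit belongs to a phone or a server.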

The big lever used in the past decade to improve processor power efficiency has been a shift from unicore to multicore designs. Several small CPU cores are inherently more power efficient than one large CPU core. We are reaching a limit, however, as most software applications don’t scale well beyond four cores.

The future lies in heterogeneous multicore. This approach consists of constructing several different types of cores, each optimized for a particular task. Existing products show efficiency improvements of up to 10 times when using optimized cores in place of a general-purpose CPU such as an ARM or x86.

This heterogeneous approach is already used to a limited extent in PC processors (which have a separate graphics core) and to a larger extent in smart-phone processors (which have graphics, video, and cellular cores), proving its value. But future processors may extend this approach to include dozens of different cores, increasing the types of functions that could be addressed in this fashion. These cores might include visual-computing accelerators, voice-recognition engines, or HTML processors.

Even CPUs themselves are becoming specialized. ARM is promoting a concept called big.LITTLE that pairs two CPUs that run the same software but have different hardware designs. The "big" processor is designed for maximum performance and kicks in during CPU-intensive operations. The "LITTLE" processor, by contrast, is optimized for power efficiency. It handles light loads such as e-mail or texting. By matching applications to the right CPU cores, the operating system can extend battery life.
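The matching step can be pictured with a toy policy: run each workload on the lowest-power core that can keep up with it. The sketch below is not ARM's scheduler or any operating system's actual algorithm; the core names, throughput figures, and wattages are all made-up values for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    max_throughput: float  # hypothetical work units per second
    active_power_w: float  # hypothetical watts while running


# Made-up figures: the large core is faster but burns far more power.
LARGE = Core("large", max_throughput=100.0, active_power_w=2.0)
SMALL = Core("small", max_throughput=30.0, active_power_w=0.3)


def pick_core(demand: float, cores=(SMALL, LARGE)) -> Core:
    """Return the lowest-power core that can sustain `demand`.

    If no core can keep up, fall back to the fastest one.
    """
    capable = [c for c in cores if c.max_throughput >= demand]
    if capable:
        return min(capable, key=lambda c: c.active_power_w)
    return max(cores, key=lambda c: c.max_throughput)


if __name__ == "__main__":
    for label, demand in [("texting", 5.0), ("e-mail", 10.0), ("game", 80.0)]:
        print(f"{label}: runs on the {pick_core(demand).name} core")
```

With these numbers, light loads such as texting and e-mail land on the small core, and only the demanding workload wakes the large one.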

Specialized Cores Improve Efficiency

One downside of these specialized cores is that they are not used except for their specific task: the video core will sit idle (dark) until you actually start watching a movie. You could view this as “wasted” silicon, but as noted above, transistors are so cheap these days that it doesn’t matter. From a power standpoint, when you’re watching a movie, it is actually ideal for the software to turn off most of the chip and run just the video core, which is designed to deliver video at the lowest possible power.

This type of processor is like a house that includes several rooms. Each room is designed with a specific purpose in mind: sleeping, eating, washing, relaxing, or working. Sometimes only one room is in use and the rest are dark (literally, at night). At other times, a few rooms might be used, but never all at once. Instead of trying to optimize the usage level of each room, homeowners focus on optimizing the room for its intended activity.

The combination of architectural techniques such as heterogeneous multicore and new advances in transistor design should enable a continual increase in processor performance and capabilities for at least the next decade. The industry has been dealing with power limits for the past decade while maintaining a high rate of performance growth, so I am optimistic that it can stay on this path.