The semiconductor industry’s rapid move toward a 90-nm process node to achieve performance and cost benefits puts enormous pressure on power budgets. Decreasing transistor sizes lead to increased leakage current and, as a result, static power. Dynamic power also rises with system speeds and higher design density, but in a more linear fashion. Today, many designs have 50-50 static and dynamic power dissipation. According to International Technology Roadmap for Semiconductors (ITRS) projections, static power is increasing exponentially at every process node, making innovative process technologies imperative.
With the adoption of FPGAs in more markets and systems every year (driven by increasing performance/density and decreasing price), FPGA power consumption within the entire system is critical. Leading FPGA vendors are already adopting new techniques to mitigate static and dynamic power consumption.
FPGAs and ASICs often account for most of the power consumed in a system. The focus of this article is on FPGAs, but some of the issues and descriptions of power optimization apply to ASICs as well.
A given system or individual components typically have a power budget, which usually falls into two major areas. The first area is a simple, practical one—choosing the power capacity of the supplies being used in a design to meet the system power needs. The second area is about thermal concerns, which need to be understood to keep the system working within the temperature specifications of the various components. To this end, it’s important to know where power consumption comes from in the FPGAs being chosen, and how one can optimize it.
Working Within A Power Budget
Here’s a typical example of a power budget: A board has a power budget of 20 W, with a normal operating environment of 10° to 40°C. If one or more fans fail, ambient air above certain components may rise above 70°C. Many component manufacturers specify operating conditions that range up to 85°C junction temperature for commercial-grade parts and 100°C for industrial-grade parts.
Table 1 shows a set of needs that span component environment, such as temperature, power-supply tolerances, etc. Importantly, it also shows the estimated power consumption (from tools and manufacturers’ datasheets) of each FPGA in the design as well as the ASIC and DDR memory.
The goal is to see if the 20-W power budget can be met. Parts like ASICs, DRAM, etc., come from the manufacturers with fixed maximum power consumption. The remaining components on the board or system are the FPGAs. Using power-estimation tools, we can see where we fall and if we need to optimize our FPGAs’ power consumption. Table 2 shows the various resources being used by the designer inside each FPGA.
For FPGAs, Xilinx tools are available for power prediction (Fig. 1), which allows comparison against power targets. Table 1 and Table 2 show that, based on our power-analysis tools, we have calculated 3.5 W, 5.5 W, and 7.0 W for the three FPGAs on our board. The estimated total power for the FPGAs is 16 W, and that of the ASIC and DRAM is 10 W. This totals 26 W, which exceeds our budget of 20 W by 6 W. This is the point where we must learn what consumes power in the FPGA, and what methods of optimization may be available.
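The budget arithmetic above can be written out as a small check. This is only a sketch using the figures quoted in the text; the function and field names are illustrative and are not part of any vendor tool.

```python
# Power-budget check using the figures quoted in the article.
BUDGET_W = 20.0
FPGA_POWER_W = [3.5, 5.5, 7.0]   # tool estimates for the three FPGAs
ASIC_DRAM_POWER_W = 10.0         # combined fixed figure for the ASIC and DRAM


def budget_report(budget_w, fpga_w, fixed_w):
    """Summarize how far a board is over (or under) its power budget."""
    fpga_total = sum(fpga_w)
    board_total = fpga_total + fixed_w
    # Power left for the FPGAs once the fixed-power parts are accounted for.
    fpga_allowance = budget_w - fixed_w
    return {
        "fpga_total_w": fpga_total,
        "board_total_w": board_total,
        "overrun_w": board_total - budget_w,
        "fpga_allowance_w": fpga_allowance,
        # Fraction by which total FPGA power must fall to meet the budget.
        "required_fpga_reduction": 1.0 - fpga_allowance / fpga_total,
    }


report = budget_report(BUDGET_W, FPGA_POWER_W, ASIC_DRAM_POWER_W)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Because the ASIC and DRAM figures are fixed, the whole 6-W overrun has to come out of the FPGAs: their combined power must drop from 16 W to 10 W, a reduction of 37.5%.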
Power Consumption In FPGAs
There are two primary areas of power consumption in FPGAs. Static power comes from transistor leakage, and dynamic power comes from voltage swing, toggle rate, and capacitance. Both are important factors in meeting a power budget and power optimization. Therefore, it’s important to know what each factor is and how it varies with different operating conditions.
Static Power And Its Variation With Process, Voltage, And Temperature: Static power is now significant at 90 nm for both ASICs and FPGAs. To boost transistor performance, one needs to lower the device’s threshold voltage (VT), which also increases leakage. Leakage of the 90-nm transistors varies strongly with process, because the VT of the transistors varies due to doping, and the gate length varies due to lithography. This can create large changes in transistor speed and leakage. Reducing either VT or gate length increases both leakage and speed, and increasing either one does the opposite. The variation in leakage and static power is about 2 to 1 between worst-case and typical process.
Leakage and static power are also influenced strongly by core voltage (VCCINT), with variations that go approximately as the square and cube, respectively, of VCCINT. Static power shows about a 15% increase with only a 5% increase in VCCINT. Leakage is also very strongly influenced by junction (or die) temperature (TJ).
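The approximate square and cube relationships can be checked numerically. The sketch below treats them as exact power laws, which is a simplification of the "approximately" in the text, and the function names are illustrative. The 1.2-V nominal VCCINT is the Virtex-4 figure quoted later in the article.

```python
V_NOMINAL = 1.2  # nominal VCCINT in volts


def leakage_scale(v, v_nom=V_NOMINAL):
    """Leakage current relative to nominal, assuming it goes as V squared."""
    return (v / v_nom) ** 2


def static_power_scale(v, v_nom=V_NOMINAL):
    """Static power relative to nominal: leakage current times voltage,
    so approximately V cubed."""
    return leakage_scale(v, v_nom) * (v / v_nom)


v_high = V_NOMINAL * 1.05  # supply sitting 5% above nominal
print(f"leakage      x{leakage_scale(v_high):.3f}")
print(f"static power x{static_power_scale(v_high):.3f}")
```

Under this simple model a 5% rise in VCCINT raises static power by a little under 16%, in line with the roughly 15% figure given above.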
Since each of these factors (process, voltage, and temperature) has a strong effect on the FPGA’s leakage and static power, it’s important for the board and system designer to understand them and how they might influence total power consumption of the FPGA or ASIC. Gate-to-substrate leakage is also part of total leakage, but isn’t highly temperature-dependent. Figure 2 shows the variation in transistor leakage and, hence, static power in the 90-nm FPGAs, due to process, voltage, and temperature.
Due to the increasing transistor leakage when moving toward a high-performance 90-nm FPGA, our IC designers adopted the use of a third gate-oxide thickness in the transistors of the newest Virtex-4 FPGAs. In previous FPGAs and ASICs, only two oxide thicknesses were used (dual-oxide): a thin oxide for core transistors and a thick oxide for I/O transistors. The use of a third, middle thickness of oxide (triple-oxide) and a higher VT in a portion of the transistors dramatically reduces overall leakage and, ultimately, static power.
Dynamic Power And Its Variation With Process, Voltage, And Temperature: Dynamic power is power consumed by transistors and traces that are toggling. The effect, simply put, is due to changing an internal voltage from a logic “0” to a logic “1” (or vice versa) and charging a capacitance to that voltage. The more often this is done, the more power is consumed. In FPGAs, the transistors are used for logic and for programmable interconnects between metal traces. The capacitance in question is transistor parasitic capacitance and metal interconnect capacitance. The formula for dynamic power is:
$$P_{\mathrm{DYNAMIC}} = n\,C\,V^{2}\,f$$
where n = number of toggling nodes, C = capacitance, V = voltage swing, and f = frequency
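As a minimal sketch, the formula can be evaluated directly. The node count, per-node capacitance, and toggle frequency below are hypothetical values chosen only to make the arithmetic visible. They are not measurements of any device.

```python
def dynamic_power_w(n_nodes, c_farads, v_volts, f_hz):
    """Dynamic power in watts from P = n * C * V^2 * f.

    n_nodes  : number of toggling nodes
    c_farads : capacitance per node (transistor parasitic plus interconnect)
    v_volts  : voltage swing
    f_hz     : toggle frequency
    """
    return n_nodes * c_farads * v_volts ** 2 * f_hz


# Hypothetical design: 100,000 nodes of 10 fF each, 1.2-V swing, 100 MHz.
base = dynamic_power_w(100_000, 10e-15, 1.2, 100e6)

# Power is linear in n, C, and f, so halving the switched capacitance
# (for example, through shorter routes) halves the power.
shorter_routes = dynamic_power_w(100_000, 5e-15, 1.2, 100e6)

print(f"baseline:       {base:.3f} W")
print(f"half the C:     {shorter_routes:.3f} W")
```

The linear terms are the ones a designer influences through packing, placement, and clocking choices, while the squared voltage term is set by the supply.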
All nodes in the FPGA consume power through a combination of charging transistor parasitic capacitance and metal interconnect capacitance. The latter depends on the length of routes in the FPGA, while net node capacitance is determined by the number of switching transistors. Tighter logic packing will reduce the number of switching transistors and minimize routing lengths, which will reduce dynamic power. Table 3 shows variation of dynamic power with voltage swing VCCINT.
Process and temperature cause little variation in dynamic power. Taken together, their effect is no more than about 5% to 10%.
Optimizing Power Consumption Through FPGA Environmental Changes
To optimize the power consumption of a given design, certain things can be done independently from the design contained within the FPGA. Knowing one’s environment is therefore important.
Temperature: Controlling temperature can help reduce static power. A reduction in junction temperature from 100° to 85°C will reduce static power by about 20%, as shown earlier in Figure 2. This is significant because in some designs, FPGA static power represents a sizable portion (30 to 40%) of the total power budget. The reduction in junction temperature can be achieved by greater airflow and larger heatsinks, which will transfer heat away from the FPGA. Such a reduction also increases reliability.
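One way to reason about airflow and heatsink choices is the standard steady-state thermal relation TJ = TA + θJA × P, where θJA is the junction-to-ambient thermal resistance. The article does not give this relation or any θJA values, so the sketch below is an assumption layered on the text: the 7.0-W and 40°C figures come from the example board, and everything else is illustrative.

```python
def junction_temp_c(ambient_c, power_w, theta_ja_c_per_w):
    """Steady-state junction temperature: TJ = TA + theta_JA * P."""
    return ambient_c + theta_ja_c_per_w * power_w


def required_theta_ja(tj_target_c, ambient_c, power_w):
    """Largest junction-to-ambient thermal resistance (in C/W) that keeps
    the junction at or below the target temperature."""
    return (tj_target_c - ambient_c) / power_w


# Largest FPGA from the example board (7.0 W) at the 40 C upper ambient.
for target in (100.0, 85.0):
    theta = required_theta_ja(target, 40.0, 7.0)
    print(f"TJ <= {target:.0f} C needs theta_JA <= {theta:.2f} C/W")
```

This ignores the feedback described above: as the junction cools, static power falls, which lowers the dissipated power a little further. Treat the result as a conservative first estimate rather than a substitute for the vendor's thermal data.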
Voltage: In the previous discussions, it was shown that keeping core voltage at or below nominal will reduce static and dynamic power. The static and dynamic power drawn from VCCINT is often the largest component of FPGA power. FPGAs are usually specified to run and meet performance with the power-supply voltage within ±5% of nominal. Figure 2 and Table 3 show that a ±5% variation in VCCINT causes about a ±15% and ±10% variation in static and dynamic power, respectively. To the extent that the VCCINT power supply can be specified more tightly, it can also be set at or even slightly below nominal, rather than allowing a worst case that’s 5% above nominal.
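The benefit of a tighter supply can be sketched by comparing worst-case power at the top of each tolerance band. The cube and square power laws are idealized versions of the approximate relationships quoted above, and the ±2% regulator and 0.98 setpoint are hypothetical values, not recommendations.

```python
SPEC_MIN = 0.95  # device limits as a fraction of nominal VCCINT (+/-5%)
SPEC_MAX = 1.05
EPS = 1e-12      # guard against floating-point rounding at the limits


def worst_case_scales(setpoint, tolerance):
    """Static and dynamic power, relative to nominal, at the highest
    voltage the supply can reach (setpoint and tolerance are fractions)."""
    v_max = setpoint * (1.0 + tolerance)
    return {"v_max": v_max, "static": v_max ** 3, "dynamic": v_max ** 2}


def within_spec(setpoint, tolerance):
    """True if the whole supply band stays inside the device's +/-5% window."""
    lo = setpoint * (1.0 - tolerance)
    hi = setpoint * (1.0 + tolerance)
    return lo >= SPEC_MIN - EPS and hi <= SPEC_MAX + EPS


for label, setpoint, tol in [
    ("nominal, +/-5%", 1.00, 0.05),
    ("nominal, +/-2%", 1.00, 0.02),   # hypothetical tighter regulator
    ("2% low,  +/-2%", 0.98, 0.02),   # hypothetical lowered setpoint
]:
    wc = worst_case_scales(setpoint, tol)
    print(f"{label}: static x{wc['static']:.3f}, "
          f"dynamic x{wc['dynamic']:.3f}, in spec: {within_spec(setpoint, tol)}")
```

The `within_spec` check matters because lowering the setpoint is only safe while the bottom of the band stays above the device minimum. A 0.96 setpoint with a ±2% supply, for instance, would dip below the 0.95 limit.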
Optimizing Power Consumption
Power Estimation: Before making FPGA design-related tradeoffs in power consumption, it’s important to know where you are. Based on the designer’s estimates of design size (logic and flip-flops), operating frequency, toggle rates, embedded block utilization, and environment conditions (e.g., temperature), the Web Power tool, shown on the left side of Figure 1, allows initial estimates to be made of a given design’s power consumption. It doesn’t rely on detailed information about the design, such as exact routing, placement, and usage.
The XPower tool is a detailed power-analysis program that lets the user input stimulus vectors for the design. Combining these with information from the actual placed and routed design, the tool calculates power consumption much more accurately. This tool’s output is shown in Figure 1 (right side).
FPGA Design Techniques To Reduce Power: Also available to the designer are several design-specific techniques that can lower power consumption. These include constraining logic to a small area where possible, setting synthesis flags to reduce area, and minimizing layers of logic. Pipelining is also a good technique, because it allows a higher timing constraint to be set. This, in turn, will reduce capacitance and, hence, dynamic power.
Setting Placement And Timing Constraints: With the Xilinx Floor Planner, users can create placement constraints. Also, a more-sophisticated tool, called PlanAhead, allows the designer to observe hierarchy in a design, and group sets of related logic into small areas. Using the placement and grouping constraints reduces the physical area, which allows higher performance to be achieved while minimizing routing capacitance and reducing dynamic power consumption.
Other techniques that can be brought to bear include constraining timing in a design. Synthesis tools permit the designer to input timing constraints, as do the routing and placement tools. If one raises the target timing constraint, especially the clock target, the router will try harder to meet it through more aggressive placement and routing efforts. The net effect is to minimize routes, which reduces routing power.
Clock Gating And Other Partial Shutdown Or Reprogramming Methods: Some commonly used ASIC techniques also reduce dynamic power consumption. One of these techniques is to use clock multiplexing, which makes it possible to turn off sections of the FPGA. For example, a hardware feature in the Virtex-4 FPGA and its predecessors is a clock-gating block, which provides a smooth way to turn off or on a global clock net. Better than clock enables on flip-flops, this method allows the entire large toggling clock net to be gated off, saving power on the net and the flip-flops.
Some types of designs, especially those used in battery-powered applications, need to consume power only at certain times. To accommodate those, one can turn off clocks and lower the core voltage to the minimum level that still allows FPGA data retention. In the Virtex-4, this is 0.9 V. At this level, static power is reduced by more than 60% from where it is at the nominal 1.2-V level.
A way to shrink the FPGA size required for a given set of tasks is to use the Dynamic Reconfigurability Port (DRP), which is available in the Virtex-4 and other FPGAs. If there are several functions that don’t need to coexist, this port makes it possible to reload only a portion of the FPGA. In doing so, a much smaller FPGA may be chosen, reducing static power.
Using Embedded Blocks: Another way to reduce power consumption is by using embedded blocks. While it’s more work in some cases to instantiate special blocks, the Virtex-4 FPGAs have a number of pieces of hard-IP, which are essentially ASIC gates. Some of these functions are, in fact, automatically synthesized by some of the modern synthesis tools. These new blocks have between 5X and 20X lower power than programmable-logic and programmable-interconnect implementations. The embedded blocks reduce static power by not having extra transistors (as in programmable logic), and not using programmable interconnect transistors. They reduce dynamic power via several characteristics: use of only metal interconnects versus metal and programmable interconnects; reduction of trace lengths; reduction of extra node capacitance due to a lack of pass transistors; and minimizing of layers of logic.
Power-Based Routing Optimization: Other tools coming soon will make it possible to optimize a design for power consumption. The first of these will perform automatic capacitance minimization without the need for the designer to enter faster timing constraints. Additional tools will allow power-optimized synthesis and power-optimized placement.