FPGAs have never been more exciting than they are right now. The range of choices, from platforms to tools to soft-core processors, has never been wider. Flash FPGAs can target applications that once were the realm of only low-power microcontrollers, while high-performance 28-nm FPGAs can beat out the fastest processors on the market for many applications.
Arm hard-core processor use is on the rise. The Cortex-M3 is found in Actel’s SmartFusion FPGA (see “FPGA Combines Hard-Core Cortex-M3 And Analog Peripherals”). It joins a wide range of soft-core processors like the Cortex-M1 (see “Actel/ARM Develop 32-Bit Processor For FPGAs”), the popular 8051, and Freescale’s ColdFire V1 (see “Cold, Dense, And Gratis MCU Core Targets FPGAs”).
On the other hand, there are processor cores specifically designed for their host platforms like Altera’s NIOS II (see “Latest NIOS CPU Targets 32-Bit Control Needs”) and Xilinx’s MicroBlaze (see “FPGAs Pushing MCUs As The Platform Of Choice”). More designs are utilizing these kinds of platforms, and multicore designs are becoming more common.
FPGA platforms have seen a couple of twists recently. For example, the Achronix Speedster has pushed the envelope when it comes to speed (see “1.5-GHz FPGA Takes Clock Gating To The Max”). It incorporates picoPIPE interconnects between lookup-table (LUT) blocks (Fig. 1). These asynchronous first-in, first-out (FIFO) connections allow the top clock speed to be limited by the delay of a single LUT instead of the LUT chain of logic between latches of conventional FPGA designs. The Speedster targets high-performance applications.
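A back-of-the-envelope sketch shows why taking the LUT chain out of the critical path matters. The delay and chain-length values below are hypothetical placeholders chosen for easy arithmetic, not Achronix figures, and the model ignores setup time, clock skew, and routing variation.

```python
# Illustrative model only: every number here is an assumption, not vendor data.

def max_clock_mhz(stage_delay_ns: float) -> float:
    """Upper bound on clock rate when one stage's delay sets the cycle time."""
    if stage_delay_ns <= 0:
        raise ValueError("stage delay must be positive")
    return 1000.0 / stage_delay_ns

LUT_DELAY_NS = 0.5  # assumed delay through one LUT and its local interconnect
CHAIN_LENGTH = 6    # assumed number of LUTs between registers, conventional design

# Conventional fabric: the whole LUT chain must settle within one clock period.
conventional_mhz = max_clock_mhz(LUT_DELAY_NS * CHAIN_LENGTH)

# Pipelined interconnect: only a single LUT delay limits the clock period.
pipelined_mhz = max_clock_mhz(LUT_DELAY_NS)

print(f"conventional: {conventional_mhz:.0f} MHz, pipelined: {pipelined_mhz:.0f} MHz")
```

Under these assumptions the speedup simply equals the chain length, which is why the benefit grows with the depth of logic between registers.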
Tabula’s ABAX uses a dynamic reconfiguration approach by design where the underlying fabric can change every clock cycle (see “FPGAs Enter The Third Dimension”). The ABAX TimeSpace architecture provides latches to retain the system state between transitions (Fig. 2).
Each transition changes the underlying interconnect, providing a new FPGA layout for each state or layer. Each region can have up to eight layers with the last providing state information to the base layer. If all eight layers are used, then the amount of logic available to the designer is eight times that of the base chip.
The ABAX architecture supports multiple regions (Fig. 3). This allows the number of layers and the cycle time for each region to be different. The cycle time of a region affects power consumption, with a lower cycle time using less power.
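The capacity arithmetic can be sketched in a few lines. The eight-layer limit comes from the description above. The `Region` class and the LUT counts are hypothetical, and the model assumes every layer is fully usable, which real designs may not achieve.

```python
from dataclasses import dataclass

MAX_LAYERS = 8  # from the text: each region can have up to eight layers

@dataclass
class Region:
    base_luts: int  # physical LUTs in the region (made-up figure)
    layers: int     # number of layers the region cycles through

    def effective_luts(self) -> int:
        """Logic visible to the designer: physical LUTs times layers used."""
        if not 1 <= self.layers <= MAX_LAYERS:
            raise ValueError(f"layers must be between 1 and {MAX_LAYERS}")
        return self.base_luts * self.layers

# Two regions of equal physical size that use different layer counts.
regions = [Region(base_luts=1000, layers=8), Region(base_luts=1000, layers=2)]
total_effective = sum(r.effective_luts() for r in regions)
print(total_effective)
```

A region that uses all eight layers offers eight times its physical logic, while a region using two layers offers only twice as much.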
28 nm: The Cutting Edge
Xilinx and Altera have announced their 28-nm technology. It is in the hands of select users but will be generally available in 2011. As usual, the chips are faster, have more LUTs, and use less power per LUT. They also stretch the limits of serializers/deserializers (SERDES) at the high end.
Altera’s Stratix V GT is pushing 28-Gbit/s SERDES to support optical modules. The Stratix V GX’s SERDES run at 12.5 Gbits/s, and they are designed to handle 10G standards such as 10-Gbit Ethernet.
Also, Altera continues to support Embedded HardCopy Blocks, essentially ASICs on an FPGA. This technology was used to incorporate the hard logic such as PCI Express support. The high end has more than 1 million adaptive logic elements and 512 variable precision DSP blocks based around a pair of 18- by 18-bit multipliers. Most other DSP approaches require external logic if the DSP block uses fewer bits. The blocks can be configured as a single 27- by 27-bit multiplier as well.
Xilinx announced a new trio of chips for its 28-nm Series 7 family: the Artix, Kintex and Virtex. The families split out by interface and performance, with the Virtex remaining the high-end solution.
The Artix hits low-cost platforms with up to 352k logic cells, 700 DSP slices, 3.75-Gbit/s SERDES, PCI Express Gen1 x4 support, and an 800-Mbit/s memory interface. The Kintex moves up a step and is available in a flip-chip package. It has up to 407k logic cells, 1540 DSP slices, 10.3125-Gbit/s SERDES, PCI Express Gen2 x8 support, and a 2133-Mbit/s memory interface.
At the top end is the Virtex 7 with up to 2 million logic cells and 3960 DSP slices. The SERDES run at 13.1 Gbits/s with a serial bandwidth of 1.9 Tbits/s. It supports PCI Express Gen3 x8 interfaces, and its memory interface runs at 2133 Mbits/s.
Xilinx’s unified architecture uses the same architectural building blocks for all families. The families differ mainly in performance. Also, the architecture’s power requirements allow easier migration between families when necessary. It offers a more consistent development environment as well.
Altera, Lattice Semiconductor, and Xilinx provide RAM-based FPGAs that load their configuration from an external memory. For example, Lattice’s range of FPGAs includes the LatticeECP3 with a 3.2-Gbit/s SERDES that requires less than 100 mW (see “Low-Power SERDES FPGAs Save Power”).
The LatticeECP3 uses 65-nm technology from the company’s foundry partner, Fujitsu. The FPGA fabric also has very low static power requirements. Its sysDSP slices are cascadable with up to 320 18- by 18-bit multipliers. On-chip 128-bit AES decryption is supported for secure boot support.
The RAM-based FPGAs also support partial runtime reconfiguration where a portion of the FPGA is changed while the rest of the FPGA remains unchanged. For instance, a soft-core processor could reconfigure some of its peripheral subset by loading new intellectual property (IP). Typically, the reconfigured region is isolated until the new configuration is loaded so the FPGA can perform a range of functions, although not all at the same time.
This partial reconfiguration differs from Tabula’s per-cycle reconfiguration. In Tabula’s case, the reconfiguration occurs each cycle but the transformation sequence does not change. For Altera and Xilinx, the reconfiguration would occur as new functions are required for a longer period of time, usually much longer than the reconfiguration time. The reconfiguration time is small but depends on the size of the reconfiguration region.
The hardware has been able to support partial reconfiguration. But the big change lately has been support within the design tools, making it significantly easier to define and debug using this feature.
Actel provides flash-based FPGAs. They take longer to reprogram, but they start instantly and don’t require off-chip storage for configuration information. This can be advantageous, especially in single-chip applications such as custom microcontroller replacement.
The low-power, compact, and inexpensive Actel Igloo highlights this at the low end of the spectrum (see “FPGA Costs Half A Buck”). Available in package sizes down to 3 by 3 mm, it can run using 1.2- to 1.5-V power sources.
The 130-nm-based Igloo also supports Actel’s Flash*Freeze mode, which locks down the interfaces, SRAM, and registers using as little as 2 µW. It can enter and exit this mode in less than 1 µs, putting it on par with many low-power microcontrollers while retaining the flexibility and features of the FPGA fabric.
Actel’s SmartFusion is another standout FPGA with a hard-core Cortex-M3 processor on-chip (see “FPGA Combines Hard-Core Cortex-M3 And Analog Peripherals”). It also has a very powerful analog subsystem that can run on its own and interact with the FPGA fabric as well.
The analog compute engine has access to an array of 12-bit successive approximation register (SAR) analog-to-digital converters (ADCs) and 12-bit sigma-delta digital-to-analog converters (DACs) with analog front ends, including temperature and current monitors (Fig. 4).
The SmartFusion chips are available with up to 500k system gates. The Cortex-M3 has up to 512 kbytes of flash, 64 kbytes for SRAM, and a memory protection unit. A 10/100 Ethernet media access controller (MAC) is optional. The processor has the usual complement of microcontroller peripherals such as an eight-channel direct memory access (DMA), I2C, serial peripheral interface (SPI), and serial ports.
FPGA Design Tools
While FPGA hardware continues to progress by leaps and bounds, the software changes are really making FPGAs more amenable to developers. More low-range and mid-range designs are being started with FPGAs. Likewise, soft-core processors are showing up in more than half the design wins, making FPGA design and software system design key components that must be combined at design time using integrated tools. Multicore solutions are also on the rise incorporating asymmetrical as well as symmetrical multiprocessing designs.
The key features showing up in FPGA design tools for processors include the automatic generation of header files and driver support for soft peripherals found in the FPGA fabric. Likewise, automatic interconnection between peripherals and hardware buses allows à la carte menu selection by developers who are more familiar with software than FPGA hardware design.
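To give a feel for what header generation involves, here is a minimal sketch that turns a peripheral address map into C `#define` lines. It does not reproduce any vendor's tool output. The function name, peripheral names, addresses, and macro naming scheme are all hypothetical.

```python
from typing import Dict, Tuple

def generate_header(peripherals: Dict[str, Tuple[int, int]]) -> str:
    """Emit C #define lines for each peripheral's base address and span.

    peripherals maps a name to (base_address, span_in_bytes).
    """
    lines = ["#ifndef SOC_MAP_H", "#define SOC_MAP_H", ""]
    # Sort by base address so the header mirrors the memory map layout.
    for name, (base, span) in sorted(peripherals.items(), key=lambda kv: kv[1][0]):
        macro = name.upper()
        lines.append(f"#define {macro}_BASE 0x{base:08X}u")
        lines.append(f"#define {macro}_SPAN 0x{span:X}u")
    lines += ["", "#endif /* SOC_MAP_H */"]
    return "\n".join(lines)

# Hypothetical address map for two soft peripherals in the fabric.
header = generate_header({
    "timer0": (0x40001000, 0x10),
    "uart0": (0x40000000, 0x20),
})
print(header)
```

Because the header is regenerated whenever the hardware configuration changes, the software side picks up new addresses without hand editing.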
The FPGA hardware vendors supply their own development tools, but third-party multivendor tools are available from sources such as Altium and Mentor Graphics. These multivendor tools allow designers to target different hardware platforms, but designers normally choose them for their breadth of support.
For example, Altium Designer incorporates printed circuit board (PCB) design and layout in addition to FPGA support. Also, Mentor Graphics provides ReqTracer requirements tracking and ModelSim simulation support. Its design flow can target ASICs as well. This is especially handy when FPGAs may be just the first step in the delivery process. Mentor’s Precision line of RTL and Rad-Tolerant tools addresses FPGAs in addition to ASIC platforms.
Also, Altium Designer works with Altium’s NanoBoard 3000 development platform (Fig. 5). The NanoBoard accepts FPGA modules with chips from different vendors. Altium Designer handles system design from the logic level to the PCB layout and routing through C++ application software design (Fig. 6). The entire design flow is handled from a single user interface.
Altium Designer has a range of new features, from a 3D single-layer PCB editing mode to the Testpoint manager. All soft-core processors are wrapped by the Wishbone OpenBUS wrapper, allowing them to work with any peripheral. Altium supports vendor soft cores like Xilinx’s MicroBlaze in addition to a range of other soft-core processors.
Xilinx’s ISE Design Suite 12 is designed to handle large designs and make it simple for individual developers to tackle complex FPGA designs (see “FPGA Design Tool Brings More Modularity To FPGA Design”). This release also supports partial reconfiguration of regions in real time.
Additionally, ISE 12 introduced intelligent clock gating with automated analysis and fine-grain logic slice optimizations. This enables the system to determine when logic does not affect downstream logic so the clocks for the former can be turned off, saving power. Intelligent clock gating can reduce dynamic power requirements by as much as 30%.
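The saving depends on how often each block's output actually matters. The sketch below uses a deliberately simple model in which a gated block burns dynamic power only during the cycles its clock is enabled. The block figures are invented for illustration, and the model ignores the overhead of the gating logic and of the clock tree itself.

```python
from typing import List, Tuple

def ungated_power_mw(blocks: List[Tuple[float, float]]) -> float:
    """Total dynamic power with every clock running all the time."""
    return sum(power_mw for power_mw, _ in blocks)

def gated_power_mw(blocks: List[Tuple[float, float]]) -> float:
    """Total dynamic power when each block is clocked only while enabled."""
    return sum(power_mw * enabled_fraction for power_mw, enabled_fraction in blocks)

# (dynamic power in mW, fraction of cycles the output affects downstream logic)
# All values are hypothetical.
blocks = [(100.0, 1.0), (60.0, 0.5), (40.0, 0.5)]

before = ungated_power_mw(blocks)
after = gated_power_mw(blocks)
saving = 1.0 - after / before
print(f"{before:.0f} mW -> {after:.0f} mW ({saving:.0%} saved)")
```

Blocks that are always needed gain nothing from gating, so the overall saving is set by how much of the design sits idle.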
Xilinx has also standardized on Arm’s open AMBA 4 AXI4 interconnect protocol. This works for any soft-core processor but meshes particularly well with Xilinx’s future hard-core plans, although the Series 7 has yet to gain a hard-core chip.
Altera is targeting large-scale designs as well. Its automatic layout tool now supports up to eight processors. The Design Space Explorer (DSE) also can span multiple machines, including server farms. This approach is quite handy because it can utilize multiple configurations and compare the results of each with respect to aspects like space utilization and power requirements.
Another feature that addresses large designs, the Rapid Recompile support, attempts to maintain prior routing from an existing design, significantly speeding up FPGA compile times. If more than 5% of the design changes, the system performs a full recompile. The partial reconfiguration support can also take advantage of Rapid Recompile support.
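The 5% rule amounts to a simple threshold decision, sketched below. How the tool actually measures the size of a change is not described here, so counting changed nodes against total nodes is an assumption, as are the function and mode names.

```python
FULL_RECOMPILE_PERCENT = 5  # from the text: more than 5% triggers a full recompile

def choose_compile_mode(changed_nodes: int, total_nodes: int) -> str:
    """Return "full" when more than 5% of the design changed, else "incremental".

    Node counts are a hypothetical stand-in for whatever metric the tool uses.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= changed_nodes <= total_nodes:
        raise ValueError("changed_nodes must lie between 0 and total_nodes")
    # Integer comparison avoids floating-point rounding at the exact threshold.
    if changed_nodes * 100 > total_nodes * FULL_RECOMPILE_PERCENT:
        return "full"
    return "incremental"

print(choose_compile_mode(30, 1000))
print(choose_compile_mode(80, 1000))
```

A change of exactly 5% still takes the incremental path in this sketch, matching the "more than 5%" wording.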
Altera’s Quartus II now includes the graphical Design Partition Planner, which designers can use to lay out sections of the FPGA (Fig. 7). The tool additionally now shows timing paths between blocks. A new transceiver toolkit helps with real-time transceiver interface design as well as bit-error rate testing. Meanwhile, the new TimeQuest Timing Analyzer tool can provide detailed path timing analysis. It utilizes Synopsys Design Constraints (SDC).
Many designers actually will use Quartus II’s SoPC Builder system integration tool first. It provides graphical selection of modules as well as the integration of these modules. Also, it supports memory-mapped and streaming interface interconnect fabrics.
The DSP Design Framework and DSP Builder now support the 28-nm portfolio. It is tightly integrated with MathWorks’ Matlab and Simulink and provides a similar user interface. The new Floating Point Portfolio brings floating-point IP into the mix where fixed point is often the norm. This support can take advantage of the variable precision DSP blocks.
Lattice Semiconductor’s LatticeDiamond replaces ispLEVER. It can utilize multicore platforms to accelerate layout and design flow. Optimization options for a project are collected together in an entity called a strategy. The company provides a library of predefined strategies. LatticeDiamond also includes the new Timing Analysis view and supports TCL scripting. Third-party support includes Synopsys Synplify Pro for Lattice Synthesis and Aldec Active-HDL Simulation.
Lattice’s LatticeMico32 development environment targets software development for the 32-bit, Harvard-architecture LatticeMico32 soft-core processor. The Eclipse-based tool includes the Micro System Builder for the selection of peripherals and configuration of interconnections. The system includes an instruction set simulator.
Actel’s Libero integrated development environment (IDE) supports the company's range of platforms, which includes the Cortex-M3 hard-core SmartFusion, Igloo, and ProASIC 3 series FPGAs. Also, its SmartTime provides gate-level static timing analysis information. An SDC file specifies timing constraints (Fig. 9).
Further, Actel’s SmartPower identifies static and dynamic power-consumption regions. It can analyze a system by nets, gates, I/Os, and more. A detailed power annotation report can note when the analysis hits 100% coverage. SmartPower can also estimate battery runtimes based on a design’s operational mode distribution (Fig. 10). It highlights the advantages of Actel’s Flash*Freeze mode.
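The idea behind a mode-based battery estimate can be sketched as a time-weighted average. This is not SmartPower's actual method. The 2-µW Flash*Freeze figure comes from the Igloo description earlier, while the other mode powers, the time split, and the battery are hypothetical. The model assumes an ideal battery with no regulator losses or self-discharge.

```python
from typing import Dict, Tuple

def average_power_mw(modes: Dict[str, Tuple[float, float]]) -> float:
    """Time-weighted average power; modes maps name -> (power_mw, time_fraction)."""
    total_fraction = sum(fraction for _, fraction in modes.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(power_mw * fraction for power_mw, fraction in modes.values())

def battery_runtime_hours(capacity_mah: float, voltage_v: float, avg_mw: float) -> float:
    """Ideal runtime: stored energy in mWh divided by average draw in mW."""
    return capacity_mah * voltage_v / avg_mw

# Hypothetical operational mode distribution.
modes = {
    "active": (30.0, 0.10),         # assumed active power
    "standby": (5.0, 0.20),         # assumed standby power
    "flash_freeze": (0.002, 0.70),  # 2 uW, the low end quoted for Flash*Freeze
}

avg_mw = average_power_mw(modes)
hours = battery_runtime_hours(capacity_mah=1000.0, voltage_v=3.0, avg_mw=avg_mw)
print(f"average {avg_mw:.4f} mW, about {hours:.0f} hours")
```

With the design spending most of its time in the lowest-power mode, the active and standby modes dominate the average, which is why shifting time out of those modes pays off.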
Libero’s third-party support includes the Synopsys Synplify Pro AE and Synphony HLS AE as well as the Mentor Graphics Precision RTL AE and ModelSim AE HDL Simulator. IAR’s Embedded Workbench and Keil’s MDK integrated development environments support the SmartFusion Cortex-M3 platform.
FPGA designs are getting more complex, but the software tools have improved significantly over the years. The increasing importance of power consumption and the incorporation of soft-core and hard-core processors in system designs have raised the importance of tools like Actel’s SmartPower power estimation tools as well as software development tools.
Smaller, low-power FPGAs also are pushing the need for soft-core processors into embedded battery-operated applications. Soft-core MicroBlaze processors are also popping up in space with Xilinx’s Rad-Hard Virtex-5QV FPGAs. The key to successful FPGA solutions is getting the right combination of hardware, firmware, and software. This ranges from the tiny 3- by 3-mm Actel Igloo to the 28-nm monsters with the latest high-speed SERDES.