For design flexibility, FPGAs are unsurpassed. But they have typically suffered performance or power penalties to achieve that distinction. Now, those penalties exist no more. This year’s best FPGAs can take a leading role in high-volume and high-performance designs.
The structures that typical FPGAs use to provide their configurability add overhead to their internal design. To achieve functional density and performance levels comparable to those achievable in custom designs using ASIC techniques, FPGAs often need to be one process generation ahead of their ASIC equivalent.
This adds cost and increases the power requirements of FPGA-based designs. So, developers may choose FPGAs for development and early production but often switch to ASICs to gain higher performance, lower cost in volume, smaller packaging, and lower power as products go mainstream. FPGA vendors aren’t taking this lying down, however. New offerings are pushing back the cost, size, power, and performance barriers.
PROASIC 3, IGLOO HEAT UP THE MARKET
Actel has developed the “nano” versions of the ProASIC3 and Igloo FPGA families specifically to challenge ASICs for high-volume mobile applications. The nano versions take power consumption down as low as 2 µW and package size as small as 3 by 3 mm. More than 50 nano variations are available with volume pricing below $1. Functional densities range from 10k to 250k system gates.
These nano FPGAs offer a variety of other features that support the mobile application space as well. Core and I/O operating voltages can be 1.2 or 1.5 V for battery operation. A low-power FlashFreeze mode incorporates bus hold capability to simplify putting the device to sleep when not active but still in-circuit. And, their I/O offers Schmitt trigger inputs as well as hot-swap capability.
GOODBYE, GLOBAL CLOCKING
While Actel has pushed back the cost, power, and size boundaries of FPGA offerings, Achronix Semiconductor has broken the performance barrier. By using a novel interconnect architecture it calls picoPipe, the Speedster FPGA family promises a threefold performance boost over other FPGAs to achieve the equivalent of 1.5-GHz operation. The secret is the elimination of global clocking.
The Speedster devices use traditional FPGA design structures such as reconfigurable logic blocks formed from a set of four-input lookup tables. This traditional structure ensures that designers can use familiar tools and techniques for mapping their design into the FPGA. While the structure is familiar, however, the routing and logic details are different, using a fine-grained pipeline and asynchronous propagation instead of clocking data through registers (Fig. 1).
Each connection in the picoPipe structure uses a differential pair to carry the data forward, along with an acknowledge line from the destination back to the source to indicate that valid data has been received. Each logic block holds its output data steady until it receives that acknowledgement from the destination. This handshake takes the place of a clock, letting data propagate through the logic as fast as possible and permitting multiple data values to be “in flight” simultaneously.
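The hold-until-acknowledged behavior can be illustrated with a small software model. The Python sketch below is not Achronix's implementation; the function name run_pipeline and the one-token-per-stage model are assumptions made for illustration, and the discrete loop steps exist only so the model can run in software (the hardware has no such global step). A stage hands its value forward only when the next stage is empty, which stands in for the acknowledge signal.

```python
from typing import List, Optional, Tuple


def run_pipeline(inputs: List[int], n_stages: int) -> Tuple[List[int], int]:
    """Push tokens through a handshake pipeline (hypothetical model).

    Returns the tokens in the order they leave the last stage and the
    largest number of tokens that were in flight at the same time.
    """
    stages: List[Optional[int]] = [None] * n_stages
    pending = list(inputs)
    outputs: List[int] = []
    peak_in_flight = 0

    while pending or any(s is not None for s in stages):
        # The consumer at the end always accepts, so the last stage drains.
        if stages[-1] is not None:
            outputs.append(stages[-1])
            stages[-1] = None

        # Walk back to front so a stage that was just emptied can accept.
        for i in range(n_stages - 2, -1, -1):
            if stages[i] is not None and stages[i + 1] is None:
                stages[i + 1] = stages[i]  # destination takes the data (ack)
                stages[i] = None           # source releases its held output

        # Feed the next input once the first stage is free.
        if pending and stages[0] is None:
            stages[0] = pending.pop(0)

        peak_in_flight = max(peak_in_flight,
                             sum(s is not None for s in stages))

    return outputs, peak_in_flight


if __name__ == "__main__":
    out, peak = run_pipeline([1, 2, 3, 4, 5], n_stages=3)
    print(out, peak)
```

In this model the values leave in the order they entered, and once the pipeline fills, every stage holds a different value at the same time, which is the “in flight” behavior described above.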
Once filled, the logic pipeline moves data through as fast as it can be extracted from the final stage. And, moving data in and out of the Speedster FPGA can be brisk. The first device in the family boasts 20 10.3-Gbit/s SERDES lanes and four independent 1066-Mbit/s DDR2/3 memory controllers. This enables it to keep pace with such interfaces as 10G Ethernet, PCI Express (PCIe) generation 2, Sonet, Serial RapidIO, and 6-Gbit/s SATA.
KEEPING DESIGNS FLEXIBLE
For many systems, growth in the demand for memory has outpaced the industry’s ability to increase the density of memory chips. To get the most out of their processors, designers find themselves needing to either increase the number of dual-inline memory module (DIMM) memory sockets or use modules with the highest-density memory chips available. Either option will add significantly to the design’s cost.
MetaRAM’s MetaSDRAM chip set provides another design option (Fig. 2). It sits between the socket interface and the memory to make a collection of smaller DDR3 SDRAMs look like a single, large device to system hardware and software. The chip set also presents this memory to the system as a single electrical load, which ensures that bus loading does not compromise system speed.
These abilities allow the creation of x4 or x8 DIMMs populated with as many as 144 lower-cost, mainstream DDR3 SDRAMs without requiring designers to make hardware or software changes. These DIMMs can thus achieve as much as four times the capacity attainable using the memory chips alone, providing a lower-cost option for incorporating high-density memory.
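The idea of presenting several small memories as one large device can be sketched in software. The article does not describe how the MetaSDRAM chip set maps addresses, so the Python model below is only an assumption: the class name StackedMemory and the simple divide-and-remainder mapping are hypothetical, chosen to show that the system sees one flat address range while each access lands in one of the smaller devices.

```python
from typing import Tuple


class StackedMemory:
    """Hypothetical model: several small devices shown as one flat memory."""

    def __init__(self, n_devices: int, device_bytes: int) -> None:
        self.device_bytes = device_bytes
        self.devices = [bytearray(device_bytes) for _ in range(n_devices)]

    @property
    def capacity(self) -> int:
        # Capacity seen by the system is the sum of the small devices.
        return len(self.devices) * self.device_bytes

    def locate(self, addr: int) -> Tuple[int, int]:
        """Translate a flat address into (device index, local offset)."""
        if not 0 <= addr < self.capacity:
            raise IndexError("address outside the combined memory")
        return divmod(addr, self.device_bytes)

    def write(self, addr: int, value: int) -> None:
        device, offset = self.locate(addr)
        self.devices[device][offset] = value

    def read(self, addr: int) -> int:
        device, offset = self.locate(addr)
        return self.devices[device][offset]


if __name__ == "__main__":
    mem = StackedMemory(n_devices=4, device_bytes=1024)
    mem.write(3000, 0xAB)
    print(mem.capacity, mem.locate(3000), hex(mem.read(3000)))
```

With four devices the model offers four times the capacity of one, and the caller never needs to know which device holds a given address. The electrical side of the chip set (presenting a single load on the bus) is not modeled here.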
Providing designers with more options is also an attribute of the Gen-2 compatible PLX Technology PEX 86xx PCIe switch family. These switches, available in 16-lane/16-port, 12-lane/12-port, and eight-lane/eight-port configurations, include four built-in DMA channels.
The DMA channels use a descriptor-based ring approach to control data handling that supports quality of service. Transfers can be as large as 128 Mbytes in either 32- or 64-bit width, and a single channel can move data fast enough to saturate an x8 PCIe link at 4 Gbytes/s in one direction. The 256 descriptors can be stored in the switch or in system memory.
The availability of these channels not only helps offload memory transfer tasks from the system processor, it also eliminates the need for DMA support in the host or its chip set. The switch chips can provide high-speed data transfers between I/O devices or memories connected to any of their ports, increasing the range of connectivity options in a system design as well as increasing performance.
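A descriptor ring in general works as a circular queue: software posts transfer descriptors and the engine consumes them in order. The Python sketch below shows that general technique only. The descriptor fields (src, dst, length), the class and function names, and the reading of “128 Mbytes” as 128 × 2^20 bytes are assumptions, not the PEX 86xx register layout, and quality of service is not modeled. The helper at the end checks the 4-Gbyte/s figure using the PCIe Gen 2 rate of 5 GT/s per lane with 8b/10b encoding.

```python
from dataclasses import dataclass
from typing import List, Optional

RING_SIZE = 256                    # descriptor count given in the article
MAX_TRANSFER = 128 * 1024 * 1024   # "128 Mbytes", assumed to be 128 MiB


@dataclass
class Descriptor:
    src: int      # hypothetical field: source offset
    dst: int      # hypothetical field: destination offset
    length: int   # bytes to move


class DescriptorRing:
    """Circular queue of descriptors (illustrative, not the PLX design)."""

    def __init__(self, size: int = RING_SIZE) -> None:
        self.slots: List[Optional[Descriptor]] = [None] * size
        self.head = 0    # next slot software fills
        self.tail = 0    # next slot the engine consumes
        self.count = 0

    def post(self, d: Descriptor) -> bool:
        """Queue a transfer; return False if the ring is full."""
        if not 0 < d.length <= MAX_TRANSFER:
            raise ValueError("transfer length out of range")
        if self.count == len(self.slots):
            return False
        self.slots[self.head] = d
        self.head = (self.head + 1) % len(self.slots)
        self.count += 1
        return True

    def service(self, memory: bytearray) -> bool:
        """Carry out the oldest queued transfer; False if none is queued."""
        if self.count == 0:
            return False
        d = self.slots[self.tail]
        if max(d.src, d.dst) + d.length > len(memory):
            raise IndexError("transfer runs past the end of memory")
        memory[d.dst:d.dst + d.length] = memory[d.src:d.src + d.length]
        self.slots[self.tail] = None
        self.tail = (self.tail + 1) % len(self.slots)
        self.count -= 1
        return True


def pcie_gen2_bytes_per_second(lanes: int) -> int:
    """One-direction payload rate: 5 GT/s per lane, 8b/10b, 8 bits per byte."""
    return lanes * 5_000_000_000 * 8 // 10 // 8


if __name__ == "__main__":
    mem = bytearray(b"abcdefgh" + bytes(8))
    ring = DescriptorRing(size=2)
    ring.post(Descriptor(src=0, dst=8, length=4))
    ring.service(mem)
    print(bytes(mem[8:12]), pcie_gen2_bytes_per_second(8))
```

Because the producer and consumer indexes wrap around a fixed-size array, software can keep queuing work while earlier transfers are still being serviced, which is what lets the processor hand off transfers and move on.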