Thanks to advances in process technology and architecture, suppliers of FPGAs and complex programmable logic devices (PLDs) have created devices with megagate complexities, on-board CPUs, and other integrated features that rival what full ASIC providers can deliver.
The availability of low-resistance multilevel copper metallization solves the interconnect limitations faced by previous generations of high-gate-count devices. The migration down to 0.18- and 0.13-µm design features enables many more gates and memory bits to be integrated on a chip. Smaller design features also let the circuits operate at higher clock rates and lower voltage levels. So, the programmable chips can deliver high performance at low power levels, comparable to what many full ASIC solutions can achieve.
A key reason for using programmable logic is the rapid time-to-market it offers. Today's FPGAs can truly supply designers with a system-on-a-chip (SoC) solution that can be delivered in weeks rather than six to 12 months. But that's not always the case. Programmable-logic designs with a million gates or more can take a considerable amount of time to develop.
To reduce this time, FPGA suppliers are creating and licensing blocks of intellectual property (IP) that can be merged into your circuit design. These blocks range from simple elements such as counters and decoders to complex functions like SRAM, PCI bus interfaces, and microprocessor and DSP cores.
Pre-integrating CPU or DSP cores, large blocks of RAM, and other complex functions lets the FPGA suppliers offer higher clock frequencies than the same functions could reach if they were built from the programmable logic elements.
Many of these cores are available in "soft" form, as a design file that can be integrated into your hardware-description-language (HDL) description of the circuit and then synthesized. A few that are speed-critical are available as "firm" cores, blocks in which the circuit interconnect has already been predefined.
In some cases, cores such as blocks of SRAM, CPUs, analog phase-locked loops (PLLs), and high-speed serializer-deserializer (SERDES) I/O ports are pre-integrated on the FPGAs so the blocks can operate at their maximum clock rates. With the advent of 0.13-µm features, 32-bit CPUs can clock at 300 MHz, SERDES ports can handle 3.125 Gbits/s, and memory blocks can boast access times less than 3 ns.
To leverage the blocks of IP and the million-plus gates, FPGA suppliers have paid a lot more attention to design tools. Improved synthesis software, better timing-driven placement-and-routing tools, and more accurate verification software will help designers move concepts to silicon without the iterative loops previously encountered when trying to get timing or routing closure.
The Megagate Battle Intensifies: At the megagate and higher density levels, just three companies are duking it out in the market's high end (see the table). Altera and Xilinx are pulling no punches this year as they introduce next-generation SRAM-based families with multimegagate complexities. At the same time, Actel has pumped up the complexity of its flash-memory-based ProASIC series. Its new ProASIC Plus family of FPGAs will deliver a top complexity of 1 million gates and almost 200 kbits of embedded blocks of SRAM. Each supplier counts gates differently, though, making one-on-one comparisons extremely difficult.
The FPGA suppliers are basically battling it out in four areas to gain designers' favor. The arrays are vying with each other based on the number of available gates, the amount of embedded memory, the availability of embedded processor or other computer support blocks, and the amount and type of I/O pins and ports. In the SRAM-based FPGA arena, Altera and Xilinx have been trying to outdo each other, and system designers are benefiting from this product development frenzy.
For example, Altera's latest family of devices, the Stratix series, will supply up to 114,000 logic elements (each the equivalent of about 26 gates) for a total of about 3 million logic gates. In addition to the gate-level logic, these top-of-the-line devices will include 10 Mbits of dedicated SRAM, 28 support blocks for DSP operations, up to 12 PLLs, and many single-ended and differential I/O options.
Even as Altera introduces the Stratix family, the company is still developing new members for its previous APEX II and Excalibur series. It recently released the high-end member of the Excalibur series, which has a complexity of 38,400 logic elements (about 1 million gates). But in addition to the logic, the Excalibur series includes an embedded ARM9 hard core that can run at clock speeds of up to 200 MHz.
The Stratix arrays are based on a 1.5-V, 0.13-µm all-layer-copper process that delivers system performance levels approaching 250 MHz. The specialized DSP support blocks include complex multiplier-accumulators that deliver 2 gigamultiplication-accumulation operations (GMACs) per second (Fig. 1). Each DSP block can be configured to provide either eight 9- by 9-bit multipliers, four 18- by 18-bit multipliers, or one 36- by 36-bit multiplier. If all 28 DSP blocks are used, the largest Stratix family member's computational throughput exceeds 56 GMACs. Since the blocks don't consume other on-chip resources, the chip's logic portion can be used for control processors, other compute tasks, and many other functions.
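The DSP throughput figure is simple multiplication, and a short Python sketch makes the arithmetic explicit. It uses only the numbers quoted above (28 blocks, 2 GMACs per block, and the three multiplier configurations). The function and constant names are illustrative and are not part of any Altera tool. Note that the product of the quoted figures comes out at exactly 56, so the "exceeds" in the text presumably reflects rounding of the per-block rate.

```python
# Figures quoted in the article for the largest Stratix family member.
DSP_BLOCKS = 28
GMACS_PER_BLOCK = 2  # giga multiply-accumulates per second, per block

# Multipliers available per DSP block, keyed by operand width in bits.
MULTIPLIERS_PER_BLOCK = {9: 8, 18: 4, 36: 1}


def peak_gmacs(blocks=DSP_BLOCKS, per_block=GMACS_PER_BLOCK):
    """Aggregate throughput if every DSP block runs at its quoted rate."""
    return blocks * per_block


def total_multipliers(width_bits, blocks=DSP_BLOCKS):
    """Device-wide multiplier count when all blocks use one operand width."""
    return MULTIPLIERS_PER_BLOCK[width_bits] * blocks


if __name__ == "__main__":
    print("Peak GMACs:", peak_gmacs())
    for width in sorted(MULTIPLIERS_PER_BLOCK):
        print(f"{width}x{width}-bit multipliers:", total_multipliers(width))
```

Configured entirely for 18- by 18-bit operation, for instance, the 28 blocks would supply 112 multipliers.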
Designers at Altera also developed a novel triple architecture approach to the on-chip SRAM. Dubbed TriMatrix memory, the method sets up a hierarchy of three memory types--a fine-grained group consisting of up to 1118 small memory blocks containing 512 bits each, a second group containing up to 520 blocks of 4 kbits each, and a group of up to 12 MegaRAM blocks that each contain 512 kbits.
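To see how the three tiers add up, the following Python sketch totals the maximum block counts given above. It assumes 1 kbit equals 1024 bits, which the article does not state, and the names are purely illustrative. Under that assumption the total lands near 9 million bits, somewhat below the roughly 10 Mbits of dedicated SRAM quoted for the top device; the article doesn't say how that headline figure is counted.

```python
KBIT = 1024  # assumption: the article does not define the size of a kbit

# (tier name, maximum block count, bits per block) as quoted in the article.
TRIMATRIX_TIERS = [
    ("512-bit blocks", 1118, 512),
    ("4-kbit blocks", 520, 4 * KBIT),
    ("MegaRAM blocks", 12, 512 * KBIT),
]


def tier_bits(count, bits_per_block):
    """Capacity of one memory tier in bits."""
    return count * bits_per_block


def total_bits(tiers=TRIMATRIX_TIERS):
    """Combined capacity of all tiers in bits."""
    return sum(tier_bits(count, bits) for _, count, bits in tiers)


if __name__ == "__main__":
    for name, count, bits in TRIMATRIX_TIERS:
        print(f"{name}: {tier_bits(count, bits):,} bits")
    print(f"Total: {total_bits():,} bits")
```

The breakdown shows that the dozen MegaRAM blocks account for most of the capacity, while the many small blocks contribute only a few percent.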
What's more, Altera's designers pumped up the I/O capabilities, adding true low-voltage differential signaling (LVDS) I/O lines that can handle data transfers at 840 Mbits/s. The LVDS I/O cells have dedicated SERDES circuits as well as differential I/O buffers and data-realignment logic. Single-ended I/O lines include on-chip termination to reduce external component counts and simplify system design.
The Drive To Higher Performance: Building on the popular Virtex FPGA family, Xilinx's recently unveiled Virtex II Pro series of FPGAs enhances the performance and integration levels. The top-of-the-line XC2VP50 will pack about 50,000 logic macrocells (each macrocell is equivalent to about 40 logic gates) for a total gate count of about 2 million. It also will contain four PowerPC405 32-bit CPU cores that can operate at 300 MHz, letting the chip deliver a compute throughput of over 1200 MIPS (Fig. 2). Though not as abundant as on the Stratix devices, the static RAM integrated on the Virtex II Pro will top out at 3.8 Mbits.
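Because each supplier applies its own gates-per-cell factor, it can help to lay the quoted conversions side by side. The Python sketch below does that with the figures cited in this article. The 26-gate factor for Excalibur is inferred from its quoted 38,400 logic elements and roughly 1 million gates, not stated directly, and all names are illustrative.

```python
# (logic cells, approximate gates per cell) as quoted in the article.
GATE_ESTIMATES = {
    "Altera Stratix (largest)": (114_000, 26),
    # Factor inferred from the quoted ~1 million gates; an assumption.
    "Altera Excalibur (largest)": (38_400, 26),
    "Xilinx XC2VP50": (50_000, 40),
}


def equivalent_gates(cells, gates_per_cell):
    """Vendor-style gate estimate: cell count times a per-cell factor."""
    return cells * gates_per_cell


if __name__ == "__main__":
    for device, (cells, factor) in GATE_ESTIMATES.items():
        gates = equivalent_gates(cells, factor)
        print(f"{device}: {cells:,} cells x {factor} = {gates:,} gates")
```

The result depends as much on the chosen factor as on the cell count, which is why gate totals from different suppliers are hard to compare directly.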
But designers at Xilinx have gone several steps further in the I/O and compute areas. For starters, they integrated 16 high-speed serial transceivers (3.125 Gbits/s each) licensed from Mindspeed (a Conexant company). Each I/O channel contains a complete set of user-configurable support circuits that include 8B/10B encoding and decoding, channel aggregation, and support for improved signal integrity across varying pc-board metal trace lengths. The use of the high-speed serial channels as a bus replacement would then save a significant number of I/O lines, simplifying the pc-board design and reducing electromagnetic interference.
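The bandwidth these channels offer as a bus replacement can be estimated with a little arithmetic. The Python sketch below takes the 16 channels and 3.125-Gbit/s line rate from the text and applies the 8B/10B code rate, in which every 8 data bits travel as a 10-bit symbol. It ignores any protocol framing overhead, and the function names are illustrative only.

```python
CHANNELS = 16           # serial transceivers quoted for the device
LINE_RATE_GBPS = 3.125  # raw line rate per channel, Gbits/s


def aggregate_line_rate(channels=CHANNELS, rate_gbps=LINE_RATE_GBPS):
    """Raw line rate summed over all channels, in Gbits/s."""
    return channels * rate_gbps


def payload_rate_8b10b(line_rate_gbps):
    """Data rate left after 8B/10B coding (8 data bits per 10 line bits)."""
    return line_rate_gbps * 8 / 10


if __name__ == "__main__":
    raw = aggregate_line_rate()
    print("Per-channel payload (Gbits/s):", payload_rate_8b10b(LINE_RATE_GBPS))
    print("Aggregate line rate (Gbits/s):", raw)
    print("Aggregate payload (Gbits/s):", payload_rate_8b10b(raw))
```

Under these assumptions each channel carries 2.5 Gbits/s of data, which gives a sense of how many parallel bus lines a single serial pair could stand in for.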
Other resources on the Virtex II Pro chips include 168 18- by 18-bit multipliers to accelerate various signal processing algorithms, as well as a dozen PLLs for accurate and stable timing.
To integrate all these features, Xilinx's designers employed one of the most advanced 0.13-µm manufacturing processes yet released for commercial use. The resulting Virtex II Pro chips are the first to combine both hard processor cores and the multiple 3.125-Gbit/s serial channels. As part of the array architecture, the designers incorporated a segmented routing scheme they created called active interconnect. This scheme will ensure predictable performance, permitting the design tools to deliver consistent results.
Designers also put the nine metal layers to good use. The first four levels of copper metallization are used to connect the logic that forms the processor cores and tie those cores into the rest of the chip. This way, the hard core IP blocks are "immersed" in the metallization and deliver their best performance to the rest of the chip. The remaining five metal layers are used to route signals between the cores and the user-configurable programmable logic region.
Though Actel's just-released ProASIC Plus family has only now passed the megagate milestone, it's the only megagate family that stores its configuration data in on-chip flash memory. The nonvolatile nature of the ProASIC technology lets designers eliminate the external configuration memory and offer "instant-on" capability, since the configuration pattern doesn't have to be loaded at startup.
The largest family member, the APA1000, packs 1 million system gates, 198 kbits of dual-ported embedded SRAM, and up to 712 I/O lines. Internally, the logic circuits can operate at clock rates as high as 350 MHz. External I/O operations can take place at 150 MHz. A PCI bus interface implemented in the configurable logic can operate at up to 50 MHz. A pair of PLLs is included on the chips for flexible timing and stable clock distribution. Also, the ProASIC Plus architecture retains the fine-grained structure of the previous ProASIC series, with a granularity comparable to that of gate arrays.
Of course, Altera, Xilinx, and Actel also supply many lower-complexity FPGAs and complex PLDs to suit many system requirements. Several other companies have device families with up to about 580 kgates and some integrated system support features at the lower density levels. These makers include Atmel, Cypress Semiconductor, Lattice Semiconductor, and QuickLogic.
Of these, QuickLogic offers the highest densities. Its Eclipse series features up to 583,000 gates and 82 kbits of embedded SRAM. The company also has devices with the largest variety of pre-integrated functions--embedded MIPS processors, PCI interfaces, DSP support, SERDES ports, and Fibre Channel interfaces.
Unlike the SRAM or flash-based solutions produced by Altera, Xilinx, or Actel, QuickLogic's FPGAs are based on the company's one-time programmable antifuse configuration technology. The antifuses are formed in between metal layers above the silicon and thus do not require any silicon area. The FPGAs can then squeeze a lot of gates into very small chip areas.
ASIC vendors, noting the significant market advances FPGAs have made, have developed their own strategies to compete with FPGAs in both flexibility and fast turnaround times. Some of the players in this part of the market include Adaptive Silicon, AMI Semiconductor, Atmel, Chip Express, eASIC, Leopard Logic, Lightspeed Semiconductor, LSI Logic, and NEC. To see what these ASIC vendors are doing, check out "Fast-Turn Alternatives To FPGAs" at www.elecdesign.com.
The choices of logic architectures give designers plenty of options when selecting the best solution. And it won't stop here. Still higher-density FPGAs are on the drawing board, and they will deliver even higher levels of system integration and performance.