No room for error: Why do neutrons pose serious reliability challenges for SRAM-based FPGAs?

Sub-atomic particles in galactic cosmic rays enter the Earth's atmosphere and collide with atoms of atmospheric gases. These collisions produce a wide variety of sub-atomic particles, including high-energy neutrons. A substantial quantity of these neutrons then penetrates the atmosphere and ultimately reaches the Earth's surface.

It's been discovered that these high-energy neutrons can cause flip-flops and memory cells in modern semiconductor electronics to change state, representing a considerable cost and risk for today's chip manufacturers (see the figure). In applications ranging from safety-critical to consumer, these events carry significant costs and liabilities. As devices become denser and more complex and process technologies advance, the issue becomes increasingly challenging.

Reflecting industry-wide concern, several published industry studies and papers discuss neutron effects in discrete memory ICs. The effect of neutrons on programmable logic devices, which use memory cells to determine functionality, is a major concern. In response, iRoC Technologies, an independent third-party test company, conducted a comprehensive series of investigations in 2003 to determine the configuration memory failure rates of different FPGA architectures.

Since that time, we've seen the introduction of new FPGAs using previously unavailable advanced process technologies. In December 2005, iRoC was again commissioned to perform further testing. In compliance with JEDEC JESD-89, the industry-standard specification for the measurement of neutron effects, testing was performed on ProASIC3, the latest flash-based FPGAs from Actel, and on SRAM-based FPGA architectures from other leading vendors. The test team developed designs to detect behaviour indicative of changes in FPGA functionality, known as logic errors or single-event upsets (SEUs).

The tests demonstrated that the most recent flash-based FPGAs aren't subject to loss of configuration due to neutron effects. The tests also showed that advances in semiconductor manufacturing technology have had a detrimental impact on the reliability of SRAM-based FPGAs, making them more vulnerable to neutron-induced configuration loss. This should be a major concern for designers of high-reliability systems.

It was once commonly accepted that neutron-induced configuration loss in SRAM FPGAs depended on the altitude at which the FPGA-based system was deployed. According to iRoC Technologies, however, SRAM-based FPGA architectures are vulnerable to neutron-induced configuration loss not only under high-altitude conditions, as traditionally believed, but also in ground-based applications. When high-energy neutrons penetrate SRAM memory cells, such as those used in SRAM-based FPGAs, it's highly probable that a functional failure will cause the device to operate in an unpredictable manner.

The tests also provided an indication of how frequently a configuration upset results in a logic error in each SRAM FPGA (see the table). The results are expressed in failures in time (FITs), where one FIT is defined as one failure in 10^9 hours. Integrated circuits typically are required to have FIT rates lower than 100. In high-reliability applications, component engineers look for overall FIT rates of around 10 to 30, a number that's impossible to achieve with an SRAM FPGA.
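To put these FIT figures in perspective, the following minimal Python sketch converts a FIT rate into a per-device mean time between failures and into expected failures per year across a fleet. It assumes the usual reading of one FIT as one failure per 10^9 device-hours. The fleet size of 10,000 devices and the function names are hypothetical, chosen only for illustration; they aren't taken from the iRoC results.

```python
HOURS_PER_FIT_UNIT = 1e9   # one FIT = one failure per 10^9 device-hours
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def mtbf_hours(fit):
    """Mean time between failures, in hours, for a single device."""
    return HOURS_PER_FIT_UNIT / fit


def expected_failures_per_year(fit, devices):
    """Expected failures per year across a fleet of identical devices."""
    return fit * devices * HOURS_PER_YEAR / HOURS_PER_FIT_UNIT


# Hypothetical fleet of 10,000 deployed devices (illustrative only).
for fit in (10, 30, 100):
    years = mtbf_hours(fit) / HOURS_PER_YEAR
    fleet = expected_failures_per_year(fit, devices=10_000)
    print(f"{fit:>4} FIT: MTBF of roughly {years:,.0f} years per device, "
          f"{fleet:.2f} expected failures/year across 10,000 devices")
```

A rate that looks negligible for one device becomes visible once it's multiplied across a large installed base, which is why component engineers set FIT budgets this low.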

When the contents of a memory device are changed without damaging the device, yet the device becomes inoperative, it's called a "soft error." Even more serious are "firm errors," which occur when a memory cell in an SRAM-based FPGA is corrupted. Firm errors are permanent errors that could change the character of the FPGA. These errors aren't easily detected or corrected, and they're not transient.

If a configuration bit upsets and changes state, it could alter the entire functionality of the device. That may result in significant data corruption or the forwarding of spurious signals into other circuits in the system. In extreme cases, a firm error can become a "hard error" and cause the destruction of the device itself or the system containing the device.

Thus, a failure in an SRAM FPGA's configuration memory is a potentially catastrophic event that can result in an unpredictable and uncontrollable FPGA. Techniques used to recover SRAM-based FPGAs from neutron-induced configuration loss usually don't work. Some need system-level design intervention to monitor device configuration status, as well as reload and restart the FPGA if an upset is detected. This introduces an unacceptable degree of latency in real-time systems.
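The monitor-and-reload approach described above can be sketched in software as follows. This is a simplified model, not any vendor's actual readback mechanism: the configuration memory is a plain byte array, the check is a CRC-32 comparison against a stored "golden" image, and all names (`flip_bit`, `scrub`, `golden_image`) are hypothetical.

```python
import zlib


def flip_bit(memory, bit_index):
    """Model a single-event upset by inverting one bit in place."""
    memory[bit_index // 8] ^= 1 << (bit_index % 8)


def scrub(memory, golden_image, golden_crc):
    """Compare the memory's CRC with the golden CRC; reload on mismatch.

    Returns True if an upset was detected and the image was reloaded.
    """
    if zlib.crc32(memory) == golden_crc:
        return False
    memory[:] = golden_image  # reload the full configuration
    return True


golden_image = bytes(range(256)) * 4     # stand-in for a bitstream
golden_crc = zlib.crc32(golden_image)
config_memory = bytearray(golden_image)  # stand-in for the SRAM cells

flip_bit(config_memory, 1234)            # inject one upset
print(scrub(config_memory, golden_image, golden_crc))  # first pass after the upset
print(scrub(config_memory, golden_image, golden_crc))  # second pass, after reload
```

Even in this toy model, the corrupted configuration stays in effect from the moment of the upset until the next scrub pass finishes, which is the latency that real-time systems have to account for.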

A costly alternative is a triple modular redundancy (TMR) implementation, which uses three FPGAs instead of one, together with additional upset-immune circuitry to control majority voting, configuration monitoring, and reload/restart circuits. This approach can provide a route to managing multiple simultaneous errors; however, a serious overhead is incurred by tripling memory and implementing the required control logic.
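The majority-voting step at the heart of TMR can be illustrated with a short Python sketch. It models the three replica outputs as integers of equal bit width and applies a bitwise two-out-of-three vote. The function name and the sample values are hypothetical, and a real voter would be implemented in upset-immune hardware rather than in software.

```python
def majority_vote(a, b, c):
    """Bitwise two-out-of-three vote over three equal-width words."""
    return (a & b) | (a & c) | (b & c)


correct = 0b1011_0010
upset = correct ^ 0b0000_1000  # one replica with a single flipped bit

# A disagreement confined to one replica is outvoted by the other two.
print(majority_vote(correct, upset, correct) == correct)

# If two replicas carry the same wrong bit, the vote follows the error.
print(majority_vote(upset, upset, correct) == correct)
```

The second case shows why the voter alone isn't enough: a corrupted replica has to be detected and reloaded before a second one is hit, which is where the configuration monitoring and reload/restart circuits come in.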

Although it's difficult to quantify the true threat presented by upsets, the industry has agreed that the trend for errors is clearly set to rise with increasingly smaller process geometries.

The latest iRoC results demonstrate that neutron-induced configuration upsets pose a real threat to product quality and reliability, an important consideration as semiconductors proliferate into more of the high-reliability systems we depend on every day.

Soft errors are already of great concern in devices built in current SRAM-based technologies, and will become a major issue as device sizes continue to shrink. These errors often drastically reduce system availability, so soft-error avoidance is frequently essential to keep availability at an acceptable level.

In conclusion, immunity to neutron-induced errors must be on a designer's short list of selection criteria. Even as process geometries shrink, the underlying architectural benefits of flash-based FPGA technologies will continue to protect the integrity of both ground-based and airborne applications.

Ravi Pragasam is senior manager for military and aerospace product marketing at Actel Corp.
