Control Your Failures In Time And Keep Customers Happy

Oct. 15, 2001

No one likes it when a system fails, particularly your customer. With the advent of smaller 0.13- and 0.10-µm semiconductor geometries, and the shift from logic-based to memory-dominant chips, designers must watch out for soft errors. These result whenever the charges generated by extraneous sources exceed a critical charge required to flip data stored in a bit cell. Common causes of this discharge problem include alpha-particle bombardment, metal coupling, and system noise.

Until recently, soft errors were mainly a problem in military and aerospace applications because these errors increase with altitude and exposure to radiation. That's no longer the case with the spread of memory technology into commercial, consumer, and industrial applications, where downtime caused by soft errors can be very costly.

Except for DRAMs, memory cells in geometries larger than very deep submicron (VDSM) used to be relatively insensitive to alpha-particle radiation. On the other hand, DRAMs designed in 0.13-µm or smaller technologies are very susceptible to soft errors. Even SRAMs are becoming sensitive to soft errors because of their small memory bit cells, where a logic state of 0 or 1 is represented by a very small charge.

The chip's packaging, or just cosmic radiation, can generate alpha particles. These are doubly ionized helium particles that can penetrate 20 to 30 µm of silicon and create electron-hole pairs. A single alpha particle can create as many as a million electron-hole pairs over about 25 µm of silicon penetration.

With memory cells being so small at the VDSM geometries seen today, these electron-hole pairs can accumulate to create a charge that disrupts stored information. The amount of charge that represents a bit value in a 0.13-µm SRAM is about 1/16 of what's required in 0.25-µm geometries, making the cells almost an order of magnitude more susceptible. DRAMs, however, have always had less charge and been more vulnerable to soft errors, even before the advent of VDSM technology.
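For a rough sense of scale, the million-pair figure can be converted to charge by multiplying by the elementary charge. The Python sketch below does that conversion and compares the result against a critical charge. The function names, the `collected_fraction` parameter, and the critical-charge argument are assumptions made for illustration; they are not values for any particular process or cell.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per electron (SI definition)


def generated_charge_fc(pairs: float) -> float:
    """Charge carried by `pairs` electron-hole pairs, in femtocoulombs."""
    return pairs * ELEMENTARY_CHARGE_C * 1e15


def can_upset(pairs: float, collected_fraction: float, critical_charge_fc: float) -> bool:
    """True if the collected charge exceeds the cell's critical charge.

    `collected_fraction` (0..1) is a hypothetical knob: only part of the
    generated charge reaches the storage node, and that share depends on
    the cell layout and process.
    """
    return generated_charge_fc(pairs) * collected_fraction > critical_charge_fc


if __name__ == "__main__":
    # One million pairs is roughly 160 fC of generated charge.
    print(f"{generated_charge_fc(1e6):.1f} fC")
```

The smaller a cell's critical charge, the smaller the fraction of a single strike that is enough to flip it, which is why shrinking geometries raise the soft-error rate.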

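Failure rates due to soft errors are measured in Failures in Time (FITs). One FIT is one failure in a billion device-hours. Consider a system with 50 components that is allowed to fail no more than once a year. A year is roughly 8766 hours, so the whole system has a budget of about 114,000 FITs. If that budget is split evenly, every component must meet a design specification of about 2281 FITs.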
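The FIT budget arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes that component failure rates simply add, that the budget is divided evenly, and that a year averages 365.25 days. The function names are invented for this example.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours; assumes an average year with leap days
FIT_HOURS = 1e9               # one FIT = one failure per billion device-hours


def system_fit_budget(failures_per_year: float) -> float:
    """Convert an allowed number of failures per year into FITs."""
    return failures_per_year * FIT_HOURS / HOURS_PER_YEAR


def per_component_fit_budget(failures_per_year: float, components: int) -> float:
    """Split the system budget evenly (assumes component failure rates add)."""
    return system_fit_budget(failures_per_year) / components


if __name__ == "__main__":
    print(f"system:    {system_fit_budget(1.0):.0f} FITs")
    print(f"component: {per_component_fit_budget(1.0, 50):.1f} FITs")
```

With one allowed failure per year and 50 components, the per-component figure comes out between 2281 and 2282 FITs, which matches the number quoted above. Using a 365-day year instead gives about 2283.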

Fortunately, soft errors can be prevented or corrected because, although there's data loss, there's no damage to the underlying memory devices. Through advancements in packaging, memory design, and process techniques, soft-error rates from alpha-particle bombardment can be reduced.

In packaging, using special radiation-absorbing die coats, choosing materials with lower lead content (lead emits alpha particles), and keeping the bumps in a ball-grid array away from the memory are all very effective ways to reduce soft errors. Memory design techniques can be improved to reduce soft errors as well. Increasing transistor size will increase cell storage capacitance, and adding RC delays will increase cell-flip times.

Improved process techniques using triple wells limit accumulation of electron-hole pairs, and adding a second polysilicon layer increases capacitance. To avoid coupling, memory vendors place restrictions on over-the-block routing of embedded memories, balance bit lines for reduced differential coupling, provide power shielding, and manage signal-integrity issues.

At the system level, when prevention isn't enough, a cure is necessary. This involves adding error detection and correction logic to on-chip memory. By controlling FITs, you'll have a happier and more successful customer.
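As an illustration of what error detection and correction logic does, here is a minimal Python sketch of a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. The article does not name a particular code, so this choice is an assumption. Real memories typically use wider codes implemented in hardware, often with an extra parity bit so that double-bit errors are detected rather than miscorrected. The function names are hypothetical.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword positions 1..7 hold: p1 p2 d0 p4 d1 d2 d3.
    """
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    bits = [0] * 8                             # index 0 unused; positions 1..7
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]      # parity over positions 1, 3, 5, 7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]      # parity over positions 2, 3, 6, 7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]      # parity over positions 4, 5, 6, 7
    return sum(bits[pos] << (pos - 1) for pos in range(1, 8))


def hamming74_decode(word: int) -> tuple:
    """Return (corrected nibble, error position); position 0 means no error seen."""
    bits = [0] + [(word >> (pos - 1)) & 1 for pos in range(1, 8)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos                    # XOR of the positions of all set bits
    if syndrome:
        bits[syndrome] ^= 1                    # a single flip is located by the syndrome
    nibble = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return nibble, syndrome


if __name__ == "__main__":
    stored = hamming74_encode(0b1011)
    upset = stored ^ (1 << 4)                  # simulate a soft error in position 5
    print(hamming74_decode(upset))             # recovers the nibble and reports the position
```

Because a soft error leaves the cell itself undamaged, writing the corrected value back is enough to restore the stored data. Note that this sketch miscorrects if two bits flip in the same codeword.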