Highly accelerated life testing (HALT) and highly accelerated stress screening (HASS) are environmental test methods used to compress time scales from months or years to hours or days. Although the two processes are quite distinct, they often are discussed together.
HASS must be preceded by a HALT process that identifies possible weaknesses in a product. An accelerated stress screen can be designed based on HALT findings.
During HALT, the unit under test (UUT) is subjected to increasing levels of stress while its performance is monitored. The objective is to continue until the unit fails. By analyzing the failure and understanding the mechanisms responsible for it, the design can be modified to make the product more robust.
After each repair, the UUT again is subjected to increasing stress until failure, which should occur at successively higher levels. The process is continued until all the weak spots have been eliminated.
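The step-stress loop described above can be sketched in a few lines of Python. This is only an illustration, not a test procedure from the article: the weakness names, the stress units, the step size, and the chamber ceiling are all hypothetical, and each design fix is modeled simply as removing the weak spot that failed.

```python
from dataclasses import dataclass


@dataclass
class Weakness:
    """A hypothetical weak spot and the stress level at which it fails."""
    name: str
    fails_at: float  # arbitrary stress units, chosen for illustration


def step_stress_to_failure(weaknesses, start, step, ceiling):
    """Raise stress in fixed steps until the weakest remaining spot fails.

    Returns (level, weakness) for the first failure, or (ceiling, None)
    if nothing fails before the chamber ceiling is reached.
    """
    level = start
    while level <= ceiling:
        failed = [w for w in weaknesses if w.fails_at <= level]
        if failed:
            return level, min(failed, key=lambda w: w.fails_at)
        level += step
    return ceiling, None


def run_halt(weaknesses, start=20.0, step=10.0, ceiling=120.0):
    """Repeat stress-to-failure and repair until no weak spot is left."""
    remaining = list(weaknesses)
    history = []
    while True:
        level, culprit = step_stress_to_failure(remaining, start, step, ceiling)
        if culprit is None:
            break  # the unit survived up to the ceiling
        history.append((level, culprit.name))
        remaining.remove(culprit)  # stands in for a design modification
    return history


if __name__ == "__main__":
    uut = [
        Weakness("filter capacitor mount", 55.0),
        Weakness("connector latch", 80.0),
        Weakness("solder joint", 95.0),
    ]
    for level, name in run_halt(uut):
        print(f"failed at stress {level}: {name}")
```

In this toy model each repair pushes the next failure to a higher stress level, which is the pattern the HALT process is expected to show.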
HALT is not intended to encroach into the area of specification improvement. The specification is assumed to be correct for the intended application. Rather, HALT should ensure that a product’s performance exceeds the specification by as much as possible without making extensive, and probably expensive, design changes.
The result of this process is depicted in Figure 1. The specification remains the same before and after HALT. But for many types of products, both the operate and destruct limits have been extended considerably.
Extending the limits in this way may not always be possible. Products such as disk drives operate without much margin, and only limited improvements will be found through a HALT program. In this case, HASS will precipitate latent faults and improve the reliability of units entering the field.
For many other products, significant increases in the operating margin are obtained. This is important because the greater the margin, the less sensitive the product will be to small component variations or occasional high stress in the operating environment. For these products, a 100% HASS is unlikely to provide much benefit, and it may be more appropriate to sample production units.
The word robust describes a product’s capability to endure stress beyond the initial specification level. Webster’s New World College Dictionary defines robust using terms such as strong, hardy, sturdy, stamina, and full of vigor. As the title for Figure 1 states, HALT testing is performed to improve robustness.
Going Beyond the Prototype
Clever engineers can get a single model of almost anything to work. The challenge is to replicate the prototype’s operation in thousands of production units. This is where HASS comes in. Using information learned from the HALT exercise, accelerated tests can be designed to target the previously exposed failure mechanisms and design weaknesses. If latent faults exist, the tests will precipitate them. At the same time, HASS is a production test and should not actually weaken units destined for shipment.
The safety of a HASS program can be evaluated by running the same unit through the screen 20 times. If the unit still performs to specification afterward, the screening process has not removed excessive life from the product. The effectiveness of the screen is tested by repeatedly running products through it to see if further faults are precipitated on successive passes.
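One way to read the 20-pass check is as a bound on how much life a single screen consumes. The sketch below assumes that damage accumulates linearly from pass to pass, which is a simplifying assumption and not something the article states: under it, a unit that survives N passes has lost at most 1/N of its life on each pass. The first-pass capture ratio is likewise a hypothetical metric introduced here to illustrate the effectiveness check, and the fault counts are made-up numbers.

```python
def max_life_fraction_per_pass(passes_survived: int) -> float:
    """Upper bound on life consumed by one screen pass.

    Assumes linear damage accumulation: if the unit survived
    `passes_survived` passes, one pass used at most 1/N of its life.
    """
    if passes_survived < 1:
        raise ValueError("the unit must survive at least one pass")
    return 1.0 / passes_survived


def first_pass_capture(faults_per_pass: list[int]) -> float:
    """Share of all precipitated faults that appeared on the first pass.

    A value near 1.0 suggests later passes add little; a low value
    suggests the screen is too mild. Returns 1.0 if no faults were found.
    """
    total = sum(faults_per_pass)
    if total == 0:
        return 1.0
    return faults_per_pass[0] / total


if __name__ == "__main__":
    bound = max_life_fraction_per_pass(20)
    print(f"one pass consumes at most {bound:.0%} of life")
    # Hypothetical fault counts found on passes 1, 2, and 3.
    print(f"first-pass capture: {first_pass_capture([8, 1, 1]):.0%}")
```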
HASS is very different from a traditional burn-in procedure, which typically did not involve vibration. Running completed products in an elevated temperature environment may expose calibration drift, but a constant high temperature generally doesn’t correspond to the failure mechanisms seen in the field.
“In the pilot stage of production, you want to make sure that your manufacturing processes haven’t taken away from the hard-won ruggedization that you achieved in development,” commented Larry Edson, a senior engineer for advanced reliability methods at the General Motors North American Car Division. “You screen the early production units with HASS, using a combination of rapid thermal change and broad spectrum vibration so that every failure mechanism has a chance to go into resonance and fail.”
Performing a 100% screen on early production units provides feedback on both the manufacturing process and the components you have bought from various suppliers. But, as high-volume production ramps up, you cannot afford to do a 100% screen, nor can you assimilate all the information that would result from one.
When your production processes are under control, you will want to migrate to a highly accelerated stress audit (HASA) strategy. “Some companies will organize a lot-sampling strategy,” Mr. Edson continued. “They determine a statistically relevant but minimum sample size. They’ll run their audit and then release products in batches. Other companies treat it as an on-going statistical process control (SPC) situation. If their yield rates start to decline, they try to improve whatever was going awry.”
At first glance, this approach sounds flawed. For example, suppose one of your suppliers began using a lower-quality component and it affected the performance of your product. How many of the substandard units would escape into the field? How many of those units would fail with the attendant high replacement costs and customer dissatisfaction?
A HASA system used in this way might seem to invite exactly these problems, but that is not necessarily the case. “Since you are running a sampling process tuned to your rate of production, you should be aware a problem has occurred,” Mr. Edson explained. “On the positive side, you are screening at a stress level much higher than your customer will ever do. Because you have established such a large safety margin through HALT, it’s not necessarily certain that you will have problems in the field.
“If you have good margins, screening is not going to cause a big problem. It doesn’t mean that if you do 100% screening only half of your products will pass. You can operate an audit process with confidence because of the high margins. You’re only looking for abnormalities that you have no way of anticipating.”
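The article does not say how companies arrive at a "statistically relevant but minimum sample size" for a HASA lot audit. One common textbook approach, shown here purely as an assumption, is a zero-failure (success-run) plan: choose the smallest sample n such that, if the true defect rate were p, at least one defect would appear in the sample with confidence C. This treats units as independent and the lot as large compared with the sample.

```python
import math


def zero_failure_sample_size(defect_rate: float, confidence: float) -> int:
    """Smallest n with 1 - (1 - defect_rate)**n >= confidence.

    If n audited units all pass, a defect rate of `defect_rate` or worse
    can be ruled out at the given confidence (binomial model).
    """
    if not 0.0 < defect_rate < 1.0:
        raise ValueError("defect_rate must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    n = math.log(1.0 - confidence) / math.log(1.0 - defect_rate)
    return math.ceil(n)


def detection_probability(defect_rate: float, sample_size: int) -> float:
    """Chance that a sample of the given size contains at least one defect."""
    return 1.0 - (1.0 - defect_rate) ** sample_size


if __name__ == "__main__":
    # Illustrative targets only: detect a 5% defect rate with 95% confidence.
    n = zero_failure_sample_size(defect_rate=0.05, confidence=0.95)
    print(n, round(detection_probability(0.05, n), 3))
```

A plan like this makes the tradeoff explicit: catching smaller shifts in supplier or process quality requires a larger audit sample per lot.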
Eliminating a Bottleneck
Older-style test chambers were limited to relatively slow transitions between temperature extremes. Not so with newer six-degree-of-freedom (6-DOF) systems, which combine multi-axis vibration with thermal cycling. Many of these chambers use liquid nitrogen to provide temperature change rates of 70°C/min or faster. The result of higher heating and cooling rates is greater throughput.
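The throughput gain is simple arithmetic, sketched below. Only the 70°C/min figure comes from the text; the 5°C/min rate for an older chamber, the -40°C to +85°C extremes, and the 10-minute dwell are assumed values chosen for illustration. The calculation also uses the chamber air rate, and the product itself will generally lag the air temperature.

```python
def transition_minutes(t_low_c: float, t_high_c: float, rate_c_per_min: float) -> float:
    """Time for one ramp between two temperature extremes."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(t_high_c - t_low_c) / rate_c_per_min


def cycle_minutes(t_low_c, t_high_c, rate_c_per_min, dwell_min):
    """One full thermal cycle: ramp up, dwell, ramp down, dwell."""
    ramp = transition_minutes(t_low_c, t_high_c, rate_c_per_min)
    return 2 * ramp + 2 * dwell_min


if __name__ == "__main__":
    low, high, dwell = -40.0, 85.0, 10.0  # assumed extremes and dwell time
    for label, rate in (("older chamber (assumed 5 C/min)", 5.0),
                        ("liquid-nitrogen chamber (70 C/min)", 70.0)):
        ramp = transition_minutes(low, high, rate)
        cycle = cycle_minutes(low, high, rate, dwell)
        print(f"{label}: ramp {ramp:.1f} min, cycle {cycle:.1f} min")
```

With these assumed numbers, the time spent ramping shrinks from most of the cycle to a small part of it, so the dwell time becomes the main limit on throughput.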
Simultaneous exposure to vibration and temperature cycling also can save time when troubleshooting production problems. Eric Andersen, previously a reliability engineer at Hewlett-Packard, described his use of a HALT chamber to identify elusive failure mechanisms in different types of products.
“A power supply vendor was experiencing a high field failure rate that involved a large filter capacitor. The automatic assembly equipment was supposed to apply room-temperature vulcanizing (RTV) adhesive on a couple of tall heatsinks and on one end of this capacitor to act as additional support. It turned out that the capacitor received some RTV almost by accident, and it was this hit-or-miss process that determined whether the supply survived in the field. This problem was costing a lot of money. Using a 6-DOF chamber, I found it in less than two hours.
“Using other types of shock machines, we couldn’t find things that the new 6-DOF machines expose. For example, there was a problem with mounting tabs breaking off memory modules. The PCBs with the memory chips are made in one place and then shipped to multiple users at different locations. Some of the tabs were broken after shipment, but there was no obvious reason.
“We checked our production line to make sure the robots weren’t hitting the tabs during pick-and-place. And, we put accelerometers in the cargo planes to determine the maximum stress the modules were exposed to. We even implemented a 100% screen before we put them in the box and shipped them. Vibration testing using an electrodynamic shaker was cracking the module PCBs, but the tabs weren’t failing.
“Using the 6-DOF table, we were able to cause tab failure because it was due to coincidence of vibration in different axes. We were getting a sharp impact spike that was causing the tabs to break. The actual reason that only some of the modules failed finally was correlated to imperfections in certain mold cavities.”
In these two examples, HALT/HASS principles and experience suggested the use of a 6-DOF chamber for production troubleshooting. The specific problems were quickly identified and corrected and the source of field failures eliminated. However, there is a downside to using chambers with such fast thermal-cycling capabilities.
“In all the work I’ve done with these chambers, I’ve never seen rapid cycling cause problems on its own. Many people think that the very fast rates initiate problems, but actually they just help you find problems sooner,” Mr. Andersen added. “Unfortunately high air velocity and high temperatures are associated with fast thermal cycling. We had to start using Teflon cables because I kept melting ribbon cables.”
Published by EE-Evaluation Engineering
All contents © 2001 Nelson Publishing Inc.
No reprint, distribution, or reuse in any medium is permitted
without the express written consent of the publisher.
May 2001