MIL-HDBK-217 vs. HALT/HASS

For the last three decades, MIL-HDBK-217 has been widely used to predict product reliability.1 Today, however, highly accelerated life testing (HALT) and highly accelerated stress screening (HASS) are being recognized as effective tools for improving product reliability.2 The military handbook and HALT/HASS address different areas of reliability engineering. Is there any correlation between them?

Manufacturers usually make reliability predictions based on failure models described in MIL-HDBK-217, Bellcore TR-332, or some other model before the product is manufactured or marketed.3,4 But once a product is delivered to customers and field failure reports begin to arrive, the preliminary reliability prediction is sometimes not borne out by the real-world failure data.

Some manufacturers have said the prediction models can be wildly inaccurate when compared with performance in the field.4,5 What causes the discrepancy between the reliability prediction and the field failure reports?

The Purpose of MIL-HDBK-217

This military handbook is used to estimate the inherent reliability of electronic equipment and systems, based on component failure data. It consists of two basic prediction methods:

  • Parts-Count Analysis—Requires relatively little information about the system and primarily uses the number of parts in each category with consideration of part quality and environments encountered. Generally, the method is applied in the early design phase, where the detailed circuit design is unknown, to obtain a preliminary estimate of system reliability.
  • Part-Stress Prediction—Uses complex models composed of detailed stress-analysis information as well as environment, quality applications, maximum ratings, complexity, temperature, construction, and a number of other application-related factors. This method tends to be used near the end of the design cycle, after the actual circuit design has been defined.
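As a rough illustration of the parts-count idea, the sketch below sums quantity times generic failure rate times quality factor over the part categories, which is the commonly cited form of a parts-count estimate. The category names, rates, and quality factors are hypothetical placeholders, not handbook values, and the MTBF line assumes a constant failure rate.

```python
from dataclasses import dataclass


@dataclass
class PartCategory:
    name: str
    quantity: int          # number of parts in this category
    generic_rate: float    # failures per million hours (hypothetical value)
    quality_factor: float  # quality multiplier (hypothetical value)


def parts_count_failure_rate(categories):
    """Sum quantity * generic rate * quality factor over all categories."""
    return sum(c.quantity * c.generic_rate * c.quality_factor for c in categories)


# Hypothetical bill of materials; the numbers are placeholders for illustration.
bom = [
    PartCategory("resistor", 100, 0.002, 1.0),
    PartCategory("ceramic capacitor", 50, 0.004, 1.0),
    PartCategory("bipolar IC", 10, 0.05, 2.0),
]

total = parts_count_failure_rate(bom)
print(f"Predicted failure rate: {total:.3f} failures per million hours")
# MTBF = 1 / failure rate holds only under a constant-failure-rate assumption.
print(f"Predicted MTBF: {1e6 / total:,.0f} hours")
```

Only part quantities and categories are needed here, which is why this kind of estimate is feasible before the detailed circuit design exists.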

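The general failure model in MIL-HDBK-217 and Bellcore TR-332 is of the form:

$$\lambda_p = \lambda_b \, \pi_Q \, \pi_E \, \pi_A \cdots$$

where: $\lambda_p$ = the predicted part failure rate
$\lambda_b$ = the base failure rate, described by the Arrhenius equation
$\pi_Q, \pi_E, \pi_A, \ldots$ = factors related to component quality, environment, and application stress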

The Arrhenius equation illustrates the relationship between failure rate and temperature for components. It derives from the observed dependence of chemical reaction, gaseous diffusion, and migration rates on temperature changes:

$$\lambda_b = K e^{-E/(kT)}$$

where: $\lambda_b$ = process rate (component failure rate)
$E$ = activation energy for the process
$k$ = Boltzmann’s constant
$T$ = absolute temperature
$K$ = a constant
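To make the temperature dependence concrete, here is a minimal sketch that evaluates the Arrhenius rate in its usual form, K·exp(−E/(kT)), and the ratio of rates at two temperatures (often called the acceleration factor). The activation energy and the two temperatures are assumed values chosen only for illustration; real activation energies depend on the failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant in eV/K


def arrhenius_rate(K, activation_energy_ev, temp_kelvin):
    """Process (failure) rate: K * exp(-E / (k * T))."""
    return K * math.exp(-activation_energy_ev / (BOLTZMANN_EV_PER_K * temp_kelvin))


def acceleration_factor(activation_energy_ev, t_use_c, t_stress_c):
    """Ratio of the rate at the stress temperature to the rate at the use temperature.

    The constant K cancels in the ratio, so it is set to 1.0 here.
    """
    t_use = t_use_c + 273.15        # convert Celsius to kelvin
    t_stress = t_stress_c + 273.15
    return (arrhenius_rate(1.0, activation_energy_ev, t_stress)
            / arrhenius_rate(1.0, activation_energy_ev, t_use))


# Assumed example values: E = 0.7 eV, 55 C in use versus 125 C under stress.
af = acceleration_factor(0.7, 55.0, 125.0)
print(f"Acceleration factor: {af:.1f}")
```

Because the rate is exponential in −1/T, a moderate rise in temperature multiplies the predicted failure rate many times over. This holds only for mechanisms that actually follow the equation.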

Detailed models are provided for each part type, such as microcircuits, transistors, resistors, and connectors.

The Merit of HALT/HASS

HALT is performed during design to find the weak reliability links in the product. The stresses applied to the product are well beyond normal shipping, storage, and application conditions. HALT consists of:

  • Applying environmental stress in steps until the product fails.
  • Making a temporary change to fix the failure.
  • Stepping stress further until the product fails again, then fixing it.
  • Repeating the stress-fail-fix process.
  • Finding fundamental operational and destruct limits of the product.
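The stress-fail-fix loop can be sketched as a toy simulation. This is only an illustration under strong assumptions: each weak link is modeled as failing at a single fixed stress level, a "fix" simply removes that weak link, and the failure modes and levels are hypothetical. A real HALT involves physical testing and failure analysis, not a lookup table.

```python
def run_halt(weak_links, start, step, chamber_max):
    """Step the stress upward; at each failure, record it and 'fix' that weak link.

    weak_links maps a hypothetical failure mode to the stress level at which
    it fails. Returns (stress_level, failure_mode) findings in the order found.
    """
    remaining = dict(weak_links)
    findings = []
    stress = start
    while stress <= chamber_max and remaining:
        # Every link whose limit is at or below the current stress fails now.
        failed = sorted((limit, mode) for mode, limit in remaining.items()
                        if limit <= stress)
        if failed:
            for _limit, mode in failed:
                findings.append((stress, mode))
                del remaining[mode]  # temporary fix so that stepping can continue
        else:
            stress += step           # no failure at this level: step the stress up
    return findings


# Hypothetical weak links, with the temperature (in degrees C) at which each fails.
links = {"solder joint crack": 70, "connector fretting": 95, "oscillator drift": 120}
for level, mode in run_halt(links, start=40, step=10, chamber_max=150):
    print(f"{level} C: {mode}")
```

Each pass exposes the next-weakest link, which is how repeated stepping and fixing pushes the product toward its fundamental operational and destruct limits.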

HASS is performed in the production stage to confirm that all reliability improvements made in HALT are maintained. It ensures that no defects are introduced due to variations in the manufacturing process and vendor parts. HASS comprises the following:

  • Precipitation screen to turn latent defects into patent defects.
  • Detection screen to find patent defects.
  • Failure analysis.
  • Corrective actions.

The precipitation and detection screen limits of HASS are based on HALT results. Usually, the precipitation screen limits lie between the operational limits and the destruct limits, and the detection screen limits lie between the spec limits and the operational limits, as shown in Figure 1.3

Figure 1. HASS Limits Selected From HALT Data
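The placement rule for the screen limits can be written as a simple check. The sketch below considers only a single upper limit (for example, high temperature) and assumes that spec < operational < destruct. The class name, function name, and numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HaltLimits:
    spec: float         # product specification limit
    operational: float  # operational limit found in HALT
    destruct: float     # destruct limit found in HALT


def check_hass_limits(limits, detection, precipitation):
    """Return a list of problems with the proposed screen limits (empty if none)."""
    problems = []
    if not (limits.spec < limits.operational < limits.destruct):
        problems.append("expected spec < operational < destruct")
    if not (limits.spec <= detection <= limits.operational):
        problems.append("detection screen should lie between spec and operational limits")
    if not (limits.operational <= precipitation <= limits.destruct):
        problems.append("precipitation screen should lie between operational and destruct limits")
    return problems


# Hypothetical upper temperature limits in degrees C.
halt = HaltLimits(spec=70.0, operational=100.0, destruct=130.0)
print(check_hass_limits(halt, detection=85.0, precipitation=115.0))  # no problems
print(check_hass_limits(halt, detection=85.0, precipitation=140.0))  # beyond destruct
```

A lower limit (for example, cold temperature) would need the comparisons mirrored.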

HALT/HASS has proven effective at finding latent defects that would very likely precipitate in end-use applications and cause product failures in the field. As a result, the HALT/HASS process can effectively improve product reliability.

Why MIL-HDBK-217 Turns Out Inaccurate Predictions

The prediction techniques described in MIL-HDBK-217 for estimating system reliability are based on the Arrhenius equation, an expression with an exponential dependence on temperature. But many failure modes in the real world do not follow the equation.

For instance, mechanical vibration and shock, humidity, power on/off cycling, ESD, and dielectric breakdown (all independent of temperature) are common causes of failure. Even some temperature-related stresses, such as temperature cycling and thermal shock, can cause failures that do not follow the Arrhenius equation.

More importantly, the reliability of components in many electronic systems is improving. Consequently, component failure is no longer a major cause of system failure. Yet the MIL-HDBK-217 model still bases its system reliability prediction on part failure data.

Figure 2 illustrates the nominal percentage of failures attributable to each of eight predominant failure causes, based on data collected by the Reliability Analysis Center.6 The definitions of the eight failure causes in Figure 2 are as follows:

Parts—22%: Part failing to perform its intended function.

Design—9%: Inadequate design.

Manufacturing—15%: Anomalies in the manufacturing process.

System Management—4%: Failure to interpret system requirements.

Wear-Out—9%: Wear-out-related failure mechanisms.

No Defect—20%: Perceived failure that cannot be reproduced upon further testing. These failures may or may not be actual failures; however, they are removals and count toward the logistic failure rate.

Induced—12%: An externally applied stress.

Software—9%: Failure to perform its intended function due to a software fault.
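A quick tally of the percentages listed above shows how little of the total a part-based model can address. The sketch only adds up the figures already given for Figure 2.

```python
# Nominal share of failures by cause, in percent, as listed for Figure 2.
failure_causes = {
    "Parts": 22,
    "Design": 9,
    "Manufacturing": 15,
    "System Management": 4,
    "Wear-Out": 9,
    "No Defect": 20,
    "Induced": 12,
    "Software": 9,
}

total = sum(failure_causes.values())
non_parts = total - failure_causes["Parts"]

print(f"Total: {total}%")
print(f"Failures not attributed to parts: {non_parts}%")
```

By these figures, a prediction built only on part failure data addresses a category responsible for roughly one failure in five.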

To illustrate the disparity, consider the following: A circuit board containing 338 components of six component types is used in a mobile radio system.4 The MIL-HDBK-217 prediction gives a failure rate of 1.934 failures per million hours, as shown in Table 1. The field behavior of the board, however, shows 19 failures in a total operating time of 4,444,696 hours, corresponding to a field failure rate of about 4.274 failures per million hours. The difference, 4.274 – 1.934 = 2.34 failures per million hours, was not accounted for by the MIL-HDBK-217 prediction.

Table 1. Contribution to Failure Rate of Each Component in MIL-HDBK-217 Prediction

Component              Failure Rate (failures per million hours)
Ceramic Capacitor      0.004
Diode                  0.009
Bipolar IC             0.05
Resistor               0.052
Bipolar Transistor     1.225
Tantalum Capacitor     0.594
Calculated Failures    1.934
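The arithmetic behind this comparison can be reproduced in a few lines. The pairing of each value with a component name assumes the column order of Table 1. The failure count and operating hours are the ones quoted in the text.

```python
# Predicted contributions in failures per million hours (Table 1).
predicted = {
    "Ceramic Capacitor": 0.004,
    "Diode": 0.009,
    "Bipolar IC": 0.05,
    "Resistor": 0.052,
    "Bipolar Transistor": 1.225,
    "Tantalum Capacitor": 0.594,
}
predicted_total = sum(predicted.values())

# Field data quoted in the text.
field_failures = 19
field_hours = 4_444_696
field_rate = field_failures / field_hours * 1e6  # failures per million hours

gap = field_rate - predicted_total
ratio = field_rate / predicted_total

print(f"Predicted: {predicted_total:.3f} failures per million hours")
print(f"Field:     {field_rate:.3f} failures per million hours")
print(f"Gap:       {gap:.3f} (field rate is {ratio:.1f}x the prediction)")
```

On these numbers the field failure rate is a little more than twice the predicted rate.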

In practice, many field failures are caused by factors that the prediction models do not capture, and these are often the main source of reliability problems in today’s electronic systems. Such defects can, however, be successfully precipitated, detected, and eliminated during a HALT/HASS process.

Conclusion

Before making a reliability prediction, make sure that one of the following two conditions holds:

  1. The failure modes described in the prediction model account for the vast majority of system failures. If not, go to item 2.
  2. The prediction is made after unpredictable defects have been reduced by performing HALT/HASS.

References

  1. MIL-HDBK-217, Reliability Prediction of Electronic Equipment, U.S. Department of Defense.
  2. Hobbs, G., Accelerated Reliability Engineering HALT and HASS, John Wiley & Sons, 2000.
  3. Bellcore TR-332, Issue 6, Reliability Prediction Procedure for Electronic Equipment, Telcordia Technologies.
  4. Jones, J. and Hayes, J., “A Comparison of Electronic-Reliability Prediction Models,” IEEE Transactions on Reliability, Vol. 48, No. 2, June 1999, pp. 127-134.
  5. Leonard, C. T. and Pecht, M., “How Failure Prediction Methodology Affects Electronic Equipment Design,” Quality and Reliability Engineering International, Vol. 6, 1990, pp. 243-249.
  6. Denson, W., “A Tutorial: PRISM,” RAC, 3Q 1999, pp. 1-2.

About the Author

Barry Ma is a qualification engineer at Anritsu. He received a B.S. in physics and a master’s and Ph.D. in E.E. from Nanjing University.
Mekonen Buzuayene is a verification engineering manager at Anritsu. Previously, he worked at Plantronics and Fermi-Lab as a design engineer. Mr. Buzuayene earned a B.S.E.E. from the University of Illinois.
Anritsu, 490 Jarvis Dr., Morgan Hill, CA 95037, 408-778-2000.


Published by EE-Evaluation Engineering
All contents © 2000 Nelson Publishing Inc.
No reprint, distribution, or reuse in any medium is permitted
without the express written consent of the publisher.

November 2000
