An industry expert shares some common sense about a very simple approach to product improvement and cost reduction.
After reading several recent technical articles on highly accelerated life testing (HALT) and highly accelerated stress screening (HASS), I noted that there seemed to be many misconceptions floating around. One article described HALT as an extreme over-design activity that eventually would sink a company through the cost of endless improvements. That outcome could indeed occur if those performing HALT had no knowledge of correct procedures and concepts. The crux here is correct procedures and concepts.
Some Background Information
To help you reach the correct conclusions on where to stop, some background information must be covered.1,2 The principal idea in HALT is to find the weak links as quickly as possible and then fix them. After improving one weak link, the next weak link then is found and improved and so on until there are no weak links left that would cause field failures or increase the costs of HASS.
One concept that is essential to understand is the crossover effect illustrated in Figure 1. Consider that two different stresses are used in HALT, and let us arbitrarily pick temperature and vibration, two of the most frequently used and most important stresses.
The horizontal axis has two stresses on it. The two different stress axes can be moved left and right and stretched so that the field stress levels at which failures occur are as shown on the left-hand side and the stress levels at which failures occur during HALT are shown on the right-hand side of the figure. The vertical axis is the number of failures at any given stress level.
Hundreds or thousands of stress cycles occur every second during vibration, whereas only a few thermal cycles can be completed in an hour. So, according to Miner's criterion, the slope of the vibration-induced failure line will be greater than that of the thermally induced failure line.
As a result, you can find a given weakness using vibration in HALT whereas, in the real world, temperature would cause the same failures to occur. This crossover effect is one of the reasons for reacting to weaknesses found by stresses other than those naturally occurring.
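The cycle-rate argument above can be sketched with Miner's linear damage sum, D = sum(n_i / N_i), where failure is predicted when D reaches 1. This is only an illustration: the 500 Hz response frequency, the four thermal cycles per hour, and both cycles-to-failure values are hypothetical numbers chosen for the example, not data from the article or its references.

```python
# Hypothetical numbers throughout; none come from the article.

def hours_to_failure(cycles_per_hour: float, cycles_to_failure: float) -> float:
    """Hours until Miner's damage sum n/N reaches 1 at one constant stress level."""
    return cycles_to_failure / cycles_per_hour


def miner_damage(applied_cycles, cycles_to_failure) -> float:
    """Miner's linear damage sum D = sum(n_i / N_i); failure is predicted at D >= 1."""
    return sum(n / N for n, N in zip(applied_cycles, cycles_to_failure))


VIBRATION_CYCLES_PER_HOUR = 500 * 3600  # assumed 500 Hz response: 1.8 million cycles/hour
THERMAL_CYCLES_PER_HOUR = 4             # assumed four thermal cycles per hour

# Assumed cycles-to-failure of the same weak link under each stress.
N_VIBRATION = 1.0e7
N_THERMAL = 1.0e3

# Damage fraction consumed by one hour of vibration alone.
print(miner_damage([VIBRATION_CYCLES_PER_HOUR], [N_VIBRATION]))

# Hours needed to expose the weak link with each stress.
print(hours_to_failure(VIBRATION_CYCLES_PER_HOUR, N_VIBRATION))
print(hours_to_failure(THERMAL_CYCLES_PER_HOUR, N_THERMAL))
```

With these assumed values, vibration exposes the weak link in a few hours while thermal cycling alone would need hundreds, even though the weak link tolerates far more vibration cycles than thermal cycles.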
A second way to consider the crossover effect uses Venn diagrams of temperature and vibration overlap (Figure 2). In the area where the diagrams overlap, either temperature or vibration can bring out the weakness, so it frequently shows up sooner in vibration HALT than in temperature HALT. Combined stresses are even better, as stresses generally add up to a higher stress, though not necessarily linearly, only in a Mohr's circle sense.
When stresses above the field environment are used, things happen much faster than at normal stress levels as described by several failure modes. A generic equation that covers fatigue damage due to temperature cycling and vibration is
$$D \approx n\,\sigma^{\beta}$$
where: D = the fatigue damage accumulated
n = the number of cycles of stress
σ = the mechanical stress (in pounds per square inch, for example)
β = an exponent derived from the S-N diagram for the material. β ranges from 8 to 12 for most materials in high cycle fatigue (low stress and many cycles to failure).
This equation shows that fatigue damage grows as a high power of the stress level, so even a modest increase in stress compresses the test time dramatically. The same is true for many other stresses. Consequently, we increase the stresses as much as possible in HALT to compress the time required to find weak links, hence the name time compression.1,2
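A short numerical sketch shows how steep this time compression is. It assumes the damage relation has the power-law form D ≈ n σ^β implied by the definitions above, with β between 8 and 12. The function names and the choice of a doubled stress are illustrative assumptions.

```python
def time_compression_factor(stress_ratio: float, beta: float) -> float:
    """Damage per cycle at elevated stress divided by damage per cycle at field
    stress, assuming D ~ n * sigma**beta (so the ratio is stress_ratio**beta)."""
    if stress_ratio <= 0:
        raise ValueError("stress_ratio must be positive")
    return stress_ratio ** beta


def equivalent_field_cycles(halt_cycles: float, stress_ratio: float, beta: float) -> float:
    """Field-level cycles that accumulate the same damage as halt_cycles
    applied at stress_ratio times the field stress."""
    return halt_cycles * time_compression_factor(stress_ratio, beta)


# Doubling the stress across the exponent range quoted for high cycle fatigue.
for beta in (8, 10, 12):
    print(beta, time_compression_factor(2.0, beta))
```

Under this assumption, doubling the stress makes each cycle as damaging as 256 to 4,096 field-level cycles, which is why a weak link that would take years to appear in service can surface within minutes in HALT.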
Discussions on the crossover effect and time compression show why it is incorrect to focus on the stress type, the stress level, or the margin above the field levels when deciding whether or not to fix a particular weakness. Lack of knowledge of this fact is a major contributor to unsuccessful HALT programs.
Where to Stop the Improvements
The question of how to determine where to terminate the accelerated testing performed during HALT is asked frequently. The answer is not short, but it is simple; however, it does require some engineering skills and logic to implement. HALT is discovery testing where we intentionally stress the product to failure to find its weak links.
It is critical to remember that the failure modes and mechanisms are extremely important, and the stress types, the stress levels, and the margins regarding the stress used to find the weak links are totally unimportant1,2 and not even considered in a rational HALT program. The reasons: The crossover effect1 may be present, and we also are using time compression techniques1 in HALT.
Stresses that may not even occur in the real world for the product under consideration are used in HALT to get results rapidly. This is why the crossover effect must be recognized. Stresses over the normal field stresses are used to make things happen faster, which is time compression.
There are only two relevant considerations present when a weakness is discovered:
1. Will the weakness in the overstress conditions during HALT affect the field reliability of the product, and if so, will an improvement be cost-effective?
2. Will improvement of the weakness decrease the costs of HASS in an effective manner; that is, will it result in lower overall cost?
One article I read made no reference to these concepts. It seemed to suggest that you could just keep making the product more and more robust until finally the company went out of business because of the costs of continuing product improvement far beyond what was prudent.
This approach, of course, is not correct. We need to apply our skills and knowledge to make reasonable decisions regarding continued product improvement and limit improvements to those that will contribute to increased field reliability considering the cost/benefit ratio.
Testing over the field environments is not a guarantee of finding only irrelevant failures. We will indeed find many relevant failure modes when testing above the expected field environments. We can test to unreasonable levels and find some irrelevant failure modes. An unreasonable level would be a temperature above solder reflow, for example.
Every weakness found should be evaluated under the stated conditions. That is, will the identified weakness cause field failures or HASS cost increases above what a fix would cost? If the answer to the question is no, then no permanent fix is needed.
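The evaluation described above can be written down as a simple comparison. This is a minimal sketch rather than a prescribed procedure: the `Weakness` fields, the idea of summing the two expected costs, and the dollar figures are all assumptions made for illustration. Estimating the expected costs is where the engineering judgment lies.

```python
from dataclasses import dataclass


@dataclass
class Weakness:
    """A weak link found in HALT, with hypothetical cost estimates."""
    name: str
    expected_field_failure_cost: float   # estimated warranty and field cost if left unfixed
    expected_hass_cost_increase: float   # estimated extra HASS cost if left unfixed
    permanent_fix_cost: float            # estimated cost of a permanent improvement


def needs_permanent_fix(w: Weakness) -> bool:
    """True when the expected cost of leaving the weakness exceeds the cost of fixing it.
    Note that the stress type and level that exposed the weakness are deliberately
    not inputs to this decision."""
    return (w.expected_field_failure_cost + w.expected_hass_cost_increase) > w.permanent_fix_cost


# Invented examples for illustration only.
connector = Weakness("loose connector", 120_000.0, 15_000.0, 8_000.0)
housing = Weakness("housing creep above solder reflow", 0.0, 0.0, 50_000.0)

for w in (connector, housing):
    action = "permanent fix" if needs_permanent_fix(w) else "temporary fix only, keep testing"
    print(f"{w.name}: {action}")
```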
However, a temporary fix should be made to explore for other relevant failures that may occur at higher stress levels due to differing time-compression factors for various modes as well as to the crossover effect. All populations have distributions, that is, all members of a failure-mode set do not fail at exactly the same stress level or duration.1,2
The correct way to decide about whether or not to improve is to analyze or test for the answer to the question "Will this defect cause a failure in the product or increase HASS costs?" This analysis may require finite element, SPICE, or other modeling to determine the answer, and that answer is very dependent on the time history of the appropriate stress so the fatigue damage calculations will be reasonably accurate.
Frequently, it is more cost-effective to make an improvement instead of spending a lot of money on analyses that may not be very accurate. I prefer to simply fix weaknesses and then analyze the expenses.
After gaining some experience, you will know if it is necessary to fix the weaknesses found. Comparison of the field failures of similar products will help in determining if improvements really are necessary.
Remember that the entire population will have a spread, or distribution, of stresses over which failures occur. For that reason, it makes sense to continue to fix or at least band-aid a fix until we arrive at the level where only irrelevant failure modes are being discovered.
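To see why the population spread matters, consider a rough sketch in which the stress level at which units fail is treated as normally distributed. The normal distribution, the units, and every number here are assumptions for illustration. The sketch also ignores the crossover effect and time compression, so it understates the real argument for fixing.

```python
from statistics import NormalDist


def fraction_failing_in_field(mean_strength: float, sd_strength: float,
                              field_stress: float) -> float:
    """Fraction of the population whose failure level lies at or below the field
    stress, assuming failure levels are normally distributed (an assumption)."""
    return NormalDist(mean_strength, sd_strength).cdf(field_stress)


FIELD_STRESS = 12.0  # hypothetical field level, arbitrary stress units

# Before the fix: the HALT sample failed at 30, taken here as the population mean.
before = fraction_failing_in_field(mean_strength=30.0, sd_strength=6.0,
                                   field_stress=FIELD_STRESS)

# After the fix: mean failure level raised to 48 with the same spread.
after = fraction_failing_in_field(mean_strength=48.0, sd_strength=6.0,
                                  field_stress=FIELD_STRESS)

print(f"before fix: {before:.2e} of units fail at field stress")
print(f"after fix:  {after:.2e} of units fail at field stress")
```

Even though the tested unit failed at 2.5 times the assumed field level, roughly one unit in a thousand would fail in service before the fix under these assumptions.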
Experience, education, and logic all assist in these decisions, and I do not know of any magic procedure for determining the levels above which improvements are not beneficial. However, it is obvious that continuing to improve without limit is not an appropriate path because you will certainly surpass any reasonable level, and costs will just increase until the company goes broke making irrelevant improvements.
A major mistake is finding but not fixing a weakness because the stress used in discovery was more intense than the field level or the stress used in discovery did not exist in the field at all. Time compression and the crossover effect usually make this conclusion incorrect, although the conclusion could be correct in some limited instances.
Another fact comes into play in the decision-making process: The typical HALT shakers have spectra that are anything but straight lines.1,2 Increasing the vibration levels may result in a lower stress level within the product due to the following:
1. The spectra may drop off at a particular frequency where the product has a resonance as the overall vibration level is increased and the line spectra generated by the impacts of the vibrators move to the right.
2. The product may have pronounced nonlinear vibration behavior, which is typical of electronic assemblies. Consequently, higher overall vibration levels may lead to lower internal stress and vice versa.
These phenomena, discussed at length in references 1 and 2, certainly are not obvious to those unskilled in vibration analysis of electronic equipment or unfamiliar with Fourier analysis of repeated impacts as generated by the repeated impact HALT shakers.
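The first phenomenon can be illustrated with a highly idealized model. A strictly periodic train of impacts at rate f has its spectral energy concentrated in lines at f, 2f, 3f, and so on. The sketch below assumes a single impactor, a perfectly steady impact rate that rises with the overall vibration level, and a hypothetical 400 Hz product resonance with a 4 Hz half-bandwidth. Real repetitive-shock tables are much less tidy.

```python
def nearest_line_offset_hz(impact_rate_hz: float, resonance_hz: float) -> float:
    """Distance in Hz from the resonance to the nearest harmonic line k * impact_rate_hz
    (k >= 1) of an idealized, strictly periodic impact train."""
    k = max(1, round(resonance_hz / impact_rate_hz))
    return abs(k * impact_rate_hz - resonance_hz)


def excites_resonance(impact_rate_hz: float, resonance_hz: float,
                      half_bandwidth_hz: float) -> bool:
    """True when some harmonic line falls within the resonance half-bandwidth."""
    return nearest_line_offset_hz(impact_rate_hz, resonance_hz) <= half_bandwidth_hz


RESONANCE_HZ = 400.0       # hypothetical product resonance
HALF_BANDWIDTH_HZ = 4.0    # hypothetical half-bandwidth of that resonance

# Lower drive level: assumed 40 impacts per second, so the 10th harmonic sits at 400 Hz.
print(excites_resonance(40.0, RESONANCE_HZ, HALF_BANDWIDTH_HZ))

# Higher drive level: assumed 46 impacts per second, so the lines have moved right
# and the nearest one (9 * 46 = 414 Hz) now misses the resonance.
print(excites_resonance(46.0, RESONANCE_HZ, HALF_BANDWIDTH_HZ))
```

In this idealization the higher overall level leaves the resonance undriven, which is one way a stronger input can produce a weaker internal response.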
Many companies have attempted HALT and HASS using incorrect approaches and experienced major disasters. Often, this resulted in the company stopping all HALT and HASS activities and selling their stimulation equipment. Success only requires a little logic and engineering skill to accomplish substantial returns on investment.
When deciding whether or not to improve any weak links found during HALT, you must apply engineering skills and logic to determine if the weaknesses found will cause field failures or hold back the stress levels that can be applied in HASS and consequently increase the cost of HASS. If so, then the weakness should be improved.
It is never a good idea to just keep finding and fixing weak links with no intended end because doing so will not allow you to reach the goals of HALT:
• Decrease the costs of design.
• Achieve earlier mature product release to production.
• Decrease the costs of HASS.
• Reduce the costs of warranty.
• Increase the field reliability.
• Increase market share.
• Increase profits.
In correctly performed HALT and HASS programs, many companies have saved hundreds of millions of dollars. The old adage "Do it correctly or not at all!" seems to be appropriate advice to those attempting to make the paradigm shift to HALT.
1. Comprehensive HALT and HASS Seminar, www.hobbsengr.com
2. HALT and HASS, Accelerated Reliability, Hobbs Engineering.
Time Compression is a trademark of Hobbs Engineering, and Time Compression Systems is a trademark of HALT&HASS Systems.
About the Author
Gregg K. Hobbs, Ph.D., P.E., is president of Hobbs Engineering and HALT&HASS Systems. He has been a consulting engineer since 1978, specializing in the fields of ruggedized design and stress screening. Dr. Hobbs has taught and consulted in the United States, Canada, Mexico, Africa, Australia, New Zealand, Europe, Asia, and the Middle East and authored the book HALT & HASS, Accelerated Reliability Engineering. He has 13 patents as sole inventor and several more pending on equipment to perform HALT and HASS. Hobbs Engineering, 4300 W. 100th Ave., Westminster, CO 80031, 303-465-5988.