Designing Tests Using Simulation Part 1

Historically, product acceptance testing has stopped at specification compliance testing. While quality and safety considerations suggest that more testing may be required, product cost drives testing toward a minimal solution.

The ubiquitous desktop computer has driven down the cost of design and test. It now is possible to use widely available electronic design automation (EDA) tools to provide the additional tests at a reasonable cost. In the first of a two-part series, we will discuss how it is possible to apply long-established techniques developed for the aerospace industry to define faults and sequence measurements to produce an optimal test sequence using existing EDA technology.

But before we get into the specifics of EDA, let’s briefly review the basics of product design test. The goals are to:

Demonstrate product-specification compliance.

Detect failed parts.

Sequence tests to avoid hazards.

Isolate faults down to replaceable parts.

Analyze failure-mode effects.

Analog Test Assumptions

What is a failure?

A good design can accept part tolerances that are far wider than the tolerance of the individual components. For example, a 2-kΩ pull-up resistor may work just as well as the specified 1-kΩ resistor. We want to accept the out-of-tolerance part to take advantage of the increased yield that a robust design provides. By contrast, an open pull-up resistor could pass a test that is based upon functional requirements, yet fail in the next higher assembly when the noise environment increases.

We want to detect and reject products that contain catastrophic component failures, but accept products that may contain parametric failures that do not affect functional performance. These parametric failures are part of the tolerance distribution of the parts that we are using. Monte Carlo analysis will show the robustness of the design when we compare the resulting performance predictions with the product specification.
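As a rough illustration of comparing Monte Carlo predictions with a specification, the sketch below runs a tolerance analysis on an unloaded resistive divider. The circuit, the nominal values, the uniform tolerance distribution, and the spec window are all assumptions chosen for the example; they do not come from a real product.

```python
import random

def divider_output(v_in, r_top, r_bottom):
    """Output voltage of an unloaded resistive divider."""
    return v_in * r_bottom / (r_top + r_bottom)

def monte_carlo_yield(n_runs=10_000, tol=0.05, spec=(2.3, 2.7), seed=1):
    """Fraction of simulated units whose output falls inside the spec window."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    passed = 0
    for _ in range(n_runs):
        # Hypothetical 10-kilohm nominal parts, each drawn uniformly within +/- tol.
        r_top = 10e3 * (1 + rng.uniform(-tol, tol))
        r_bottom = 10e3 * (1 + rng.uniform(-tol, tol))
        v_out = divider_output(5.0, r_top, r_bottom)
        if spec[0] <= v_out <= spec[1]:
            passed += 1
    return passed / n_runs

if __name__ == "__main__":
    print("yield with 5% parts: ", monte_carlo_yield(tol=0.05))
    print("yield with 20% parts:", monte_carlo_yield(tol=0.20))
```

With 5% parts, the worst-case output stays inside the assumed window, so every run passes. Widening the part tolerance to 20% lets some runs fall outside it, which is the kind of margin information the comparison is meant to expose.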

Defining Failure Modes

Component failures are characterized by a finite number of catastrophic failure modes. For digital circuits, there are two predominant failure modes: the outputs may be stuck at logic one or stuck at logic zero. Film and composition resistors are characterized by open-circuit failures.

You may define failure modes using common-sense experience, historical data, or physical analysis. Depending on your application, you might want to consider additional failures; for example, IC bridging or interconnect failures that result in shorts between devices. In a PCB design, you may wish to simulate high-resistance fingerprint shorts that are caused by improper handling of sensitive components.

Abstracting subassembly failure modes to the next higher assembly is a common error. For example, consider the case of an IC op-amp. The failures detected and rejected by the op-amp foundry primarily are caused by silicon defects. Once they have been eliminated, these defects will not reappear.

At the next higher assembly, failures may be caused by static electricity, operator error in component testing, or environmental stress in manufacturing. As a result, a new set of failure modes at the next production level is required. Again, we usually can describe these failure modes in terms of catastrophic events at the device or assembly interface; for example, open, short, or stuck failure modes for each interface connection.
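The failure modes described in this section can be collected into a fault catalog, with one single-fault entry per part and mode. The following is a minimal sketch; the part-type names, the reference designators, and the `build_fault_catalog` helper are hypothetical and used only for illustration.

```python
# Catastrophic failure modes per part type, as described in the text above.
FAILURE_MODES = {
    "digital_output": ("stuck_at_1", "stuck_at_0"),
    "resistor": ("open",),
    "interface": ("open", "short", "stuck"),
}

def build_fault_catalog(parts):
    """Map {reference designator: part type} to a list of single-fault names."""
    return [
        f"{ref}:{mode}"
        for ref, kind in parts.items()
        for mode in FAILURE_MODES[kind]
    ]

# Hypothetical parts list for illustration only.
parts = {"U1.Q": "digital_output", "R1": "resistor", "J1.3": "interface"}
catalog = build_fault_catalog(parts)

if __name__ == "__main__":
    for fault in catalog:
        print(fault)
```

Each catalog entry then becomes one simulation run with that single fault inserted.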

Detecting Process Faults

Process faults can shift many parameters simultaneously. When this is a consideration, as usually is the case for ICs, the process parameters are monitored separately. If the process fails, the unit is rejected before the acceptance test is performed. Consequently, acceptance testing does not need to account for multiple parametric failure modes.

Using Simulation

Test design demands a large database or catalog of faulty circuit behavior. The only practical method of gathering the data in a reasonable time frame is via simulation. The process of using a circuit simulator is similar to building a circuit prototype.

Do not expect to throw it together and have it work the first time. First you need to have models for all of the parts in the circuit. Most models will be readily available in libraries that are included in your simulation tools. If you need a new model, make it simple.

It is easier to add complexity to a working simulation than it is to debug faulty models while the circuit topology is still incorrect, so build the circuit up gradually. Once the simulation works, it helps to compare its operation with a real product. If you have that option, the models can then be refined to improve the fidelity of the simulation.

Setting Test Limits

Test limits are established using information from the product specification, simulation, analysis, laboratory testing, and instrumentation capability. When parts fail or age, the measurement value for a given test may be near the test limit. Variations in circuit parameters, environment, and test equipment can move these results across the test limit, invalidating the test conclusion. Figure 1 shows why this happens when failed parts are included in the unit under test (UUT).

Setting the test limits suitably wide eliminates this movement from pass to fail. In most cases, there is not a product requirement that sets each test limit. The limits are determined via analysis, simulation, or lab testing. Here are a few tricks we use:

Include a liberal input voltage tolerance for power sources.

Perform a Monte Carlo analysis; include external tolerances.

Vary ambient temperature ±20°C.

Then expand the measurement tolerances so that the UUT passes all tests with a liberal margin (5 sigma or higher). Remember, any lack of knowledge about component models, test equipment, or environment always will require a wider test limit. Widening the limit is your way of accounting for unforeseen conditions.
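One simple way to turn simulation results into limits with a 5-sigma margin is to place the limits a fixed number of standard deviations around the mean of the good-unit results. The sketch below assumes the measurement is roughly normally distributed; the sample values and the `sigma_limits` name are made up for illustration.

```python
import statistics

def sigma_limits(samples, n_sigma=5.0):
    """Return (low, high) limits n_sigma standard deviations around the mean."""
    mean = statistics.fmean(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation
    return mean - n_sigma * sigma, mean + n_sigma * sigma

# Hypothetical output voltages from fault-free runs that varied supply
# voltage, part tolerances, and ambient temperature.
good_unit_runs = [4.96, 5.02, 4.99, 5.05, 4.98, 5.01]
low, high = sigma_limits(good_unit_runs)

if __name__ == "__main__":
    print(f"test limits: {low:.3f} V to {high:.3f} V")
```

A real analysis would use far more runs than six; the point is only that the limits come from the spread of the fault-free results rather than from the product specification.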

Set Alarm Conditions

Another kind of simulation limit is the stress limit or alarm condition that is offered by many EDA tools. When a part fails, it frequently is possible to overstress other parts or cause an undesired circuit condition (Figure 2).

Set power supply and current limits to the manufacturer’s specifications before part-failure-mode simulations are performed. In many instances, there are product safety hazards, such as the unintended firing of an airbag, that must be prevented. Since these conditions are caused by special circuit states, you must devise measurements to catch these problems and group them with the stress alarms.
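The alarm check itself amounts to comparing each simulated stress value against its rating for every fault in the catalog. The sketch below is an illustration only; the part names, power ratings, and simulated values are invented, and a real EDA tool would report these alarms in its own format.

```python
def stress_alarms(stresses, ratings):
    """Return the parts whose simulated stress exceeds the rated limit."""
    return sorted(part for part, value in stresses.items() if value > ratings[part])

# Hypothetical power ratings in watts.
ratings = {"R1": 0.25, "Q1": 0.50}

# Hypothetical simulated dissipation for the fault-free case and one fault.
simulated = {
    "no_fault": {"R1": 0.05, "Q1": 0.10},
    "Q1:short": {"R1": 0.60, "Q1": 0.00},
}

alarms = {fault: stress_alarms(s, ratings) for fault, s in simulated.items()}

if __name__ == "__main__":
    for fault, parts in alarms.items():
        print(fault, "->", parts or "no alarms")
```

Here the shorted transistor overstresses R1, so that fault would be flagged as one to detect before full power is applied.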

Resolve Alarms

Part failure modes that cause stress alarms should be detected as soon as possible. Tests can be sequenced in a manner that reveals these destructive failure modes early on, avoiding the possibility of damage to any other circuit parts. Safe-to-start tests can reveal potentially damaging failure modes; for example, shorts from power nodes to ground.

Organize a Test Sequence

Some test ordering has been described, based on resolving stress alarms and simulation failures. The next level of ordering is by test difficulty or cost. Perform the simple and inexpensive tests first. Group together the tests that use the same setup. Perform the tests that validate product performance after you eliminate the overstress failure modes.

Do not use product performance tests for fault detection because the tolerances probably were not set with failure modes in mind. They should only lead to a pass or out-of-tolerance conclusion.
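The ordering rules above can be expressed as a sort key: alarm-resolving tests first, then fault-detection tests grouped by setup with the cheapest setups first, and performance tests last. This is a sketch under those assumptions; the `Measurement` fields, the phase numbering, and the example tests and costs are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Measurement:
    name: str
    setup: str    # tests sharing a setup are kept together
    cost: float   # relative difficulty or cost
    phase: int    # 0 = safe-to-start, 1 = fault detection, 2 = performance

def order_tests(tests):
    """Sort by phase, then by each setup's cheapest test, then by cost within the setup."""
    cheapest = {}
    for t in tests:
        key = (t.phase, t.setup)
        cheapest[key] = min(cheapest.get(key, t.cost), t.cost)
    return sorted(
        tests,
        key=lambda t: (t.phase, cheapest[(t.phase, t.setup)], t.setup, t.cost),
    )

tests = [
    Measurement("gain_accuracy", "ac", 5, 2),
    Measurement("output_dc", "dc", 1, 1),
    Measurement("supply_short", "unpowered", 1, 0),
    Measurement("bias_current", "dc", 3, 1),
    Measurement("bandwidth", "ac", 4, 1),
]
ordered = [t.name for t in order_tests(tests)]

if __name__ == "__main__":
    print(ordered)
```

Keeping the setup name in the key holds the two DC tests together even though one of them costs nearly as much as the AC test.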

Build a Test Fault Tree

Tests are used to detect faults using the logic that is illustrated in Figure 3. Each test has one input and two outputs. The input contains a list of failure modes, and the test performs the logic that is necessary to classify the outcome as pass or fail. Each outcome has a list of failure modes that can be passed on to successive tests. The process of selecting the best test in each ordered group results in a binary fault tree.

After selecting a test, successive tests are conducted on the pass and fail nodes of the tree until no more useful information is gained. When a new test configuration is selected, try to expand the tree from each node that has more than one fault. Part 2 will develop these concepts in detail.
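The following sketch shows one way such a binary fault tree could be built from simulation data. It is not the procedure developed in Part 2: the rule used here for choosing the best test (the one that splits the remaining faults most evenly) is an assumption, and the fault and test names are hypothetical.

```python
def build_tree(faults, tests):
    """Recursively split a set of faults into a binary pass/fail tree.

    faults: set of fault names still possible at this node.
    tests:  dict mapping a test name to the set of faults that make it fail.
    Leaves are sorted lists of faults that cannot be separated further.
    """
    if len(faults) <= 1:
        return sorted(faults)
    best, best_score = None, 0
    for name, detected in tests.items():
        fail = faults & detected
        # Assumed selection rule: prefer the most even pass/fail split.
        score = min(len(fail), len(faults - fail))
        if score > best_score:
            best, best_score = name, score
    if best is None:
        return sorted(faults)  # no test separates these faults
    fail = faults & tests[best]
    return {
        "test": best,
        "pass": build_tree(faults - fail, tests),
        "fail": build_tree(fail, tests),
    }

faults = {"no_fault", "R1:open", "R2:open", "Q1:short"}
tests = {
    "dc_out": {"R1:open", "Q1:short"},
    "supply_current": {"Q1:short"},
    "gain": {"R1:open", "R2:open"},
}
tree = build_tree(faults, tests)

if __name__ == "__main__":
    print(tree)
```

In this example, three tests isolate every fault to a single leaf. When no remaining test splits a node, the faults at that node are returned together as a group that cannot be resolved with the available tests.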

Program the ATE

Finally, we get to the automatic test equipment (ATE). What we have so far is a logical structure that we can use to write the test-program software. The simulation tests must be mapped into the ATE suite. Ideally, the tests were selected because they have a reasonable mapping. The final step is to run the test using the ATE and a live UUT.

About the Author

Lawrence Meares founded Intusoft in 1985 and currently leads the software development teams responsible for the Magnetics Designer and Test Designer products. He has a patent pending for his work on the ICAP/4 schematic capture program. Previously, Mr. Meares worked at McDonnell Douglas as a manager for a circuit design team and as the principal investigator for custom IC research. He received a B.S.E.E. degree at the University of Santa Clara. His graduate studies were in electrical engineering at Seattle University and UCLA. Intusoft, P.O. Box 710, San Pedro, CA 90731, (310) 833-0710.

EE—Test Software—May

Figure 1

Copyright 1998 Nelson Publishing Inc.

May 1998
