Designing Tests Using Simulation Part 2

Today, it is possible to use widely available electronic design automation (EDA) tools to provide comprehensive testing at a reasonable cost. Part 1 in the May issue of EE described the use of simulation for designing test software. Part 2 discusses the details of test synthesis, presenting a method based on standards developed for the aerospace industry. To fully comprehend this synthesis procedure, you first need an understanding of several concepts.

Ambiguity Group

An ambiguity group is simply a list of failure modes. Because each failure mode references a part, all of that part's properties are included; the most important for this discussion is the failure weight.

Failure Weight

The failure weight of a part failure mode is an arbitrary number proportional to the part's failure rate. It is set to 1.0 for each default failure mode and will be used to grade a test. Using these weights, each ambiguity group can be assigned a probability that is the sum of its failure weights. A selection procedure will be developed that favors tests that split these cumulative weights equally between their pass and fail outcomes.
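As a concrete illustration, these ideas can be captured in a few lines of Python. This is a minimal sketch; the class and function names are our own inventions, not taken from the article or from any particular test-synthesis tool.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class FailureMode:
        """One part failure mode; all names in this sketch are illustrative."""
        part: str             # part reference designator, e.g., "R1"
        mode: str             # failure mode, e.g., "open" or "short"
        weight: float = 1.0   # failure weight, defaulted to 1.0 as described

    # An ambiguity group is simply a collection of failure modes.
    AmbiguityGroup = list[FailureMode]

    def group_weight(group: AmbiguityGroup) -> float:
        """Probability weight of a group: the sum of its failure weights."""
        return sum(fm.weight for fm in group)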

No-Fault Weight

When we begin a test, there exists an input ambiguity group that contains all of the failure modes we will consider: the fault universe for the UUT. It is useful to add a dummy failure mode, no-fault, to this universe.

The no-fault weight will change depending on where in the product life cycle the test is being performed. An initial production test should have a high yield, so we use a high no-fault weight. If the product is returned from the field, it is expected to be faulty, so the no-fault weight is low.

For built-in tests, the no-fault weight depends on the accumulated failure rate. It will change the grade for tests in the “go” line; that is, the product-acceptance test list. The no-fault weight will tend to reduce the number of tests needed to arrive at the expected conclusion.
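Continuing the sketch above, the no-fault mode can be modeled as one more failure mode whose weight depends on the life-cycle stage. The numeric weights below are illustrative assumptions only.

    # Hypothetical no-fault weights; real values would come from yield data
    # or, for built-in tests, from the accumulated failure rate.
    NO_FAULT_WEIGHT = {
        "production": 10.0,    # high yield expected, so no-fault dominates
        "field_return": 0.1,   # returned units are expected to be faulty
    }

    def fault_universe(failure_modes: AmbiguityGroup, stage: str) -> AmbiguityGroup:
        """Build the input ambiguity group: all failure modes under
        consideration plus the no-fault dummy mode for this stage."""
        no_fault = FailureMode(part="(none)", mode="no-fault",
                               weight=NO_FAULT_WEIGHT[stage])
        return list(failure_modes) + [no_fault]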

Test Definition

A test consists of a comparison of a resultant value against limits that classify the UUT as good (pass) or bad (fail). The resultant value can be the combination of one or more measurements and calculations. Each test has an associated ambiguity group.

At the beginning of the test, there is a set of failure modes that has not been tested. These are defined by the Input Ambiguity Group. The test itself is capable of detecting a number of failure modes. These modes are grouped into a set called the Test Ambiguity Group. The pass and fail outcomes then carry a set of failure modes that are the result of logical operations carried out between the Input Ambiguity Group and the Test Ambiguity Group such that:

Fail Ambiguity Group = Input Ambiguity Group AND Test Ambiguity Group

Pass Ambiguity Group = Input Ambiguity Group MINUS Test Ambiguity Group

where: AND represents the intersection of the groups, and MINUS removes the elements of one group from the other.

Using MINUS here is a convenient way of avoiding the definition of NOT (Input…), since we really are not interested in a universe that is greater than the union of the Input and Test Groups. Figure 1 illustrates this logic.
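Both operations map naturally onto set membership tests. Continuing the earlier sketch, with the hypothetical FailureMode type defined above:

    def split_outcomes(input_group: AmbiguityGroup,
                       test_group: AmbiguityGroup):
        """Partition the input ambiguity group by one test's outcome."""
        detectable = set(test_group)  # failure modes this test can detect
        fail_group = [fm for fm in input_group if fm in detectable]      # AND
        pass_group = [fm for fm in input_group if fm not in detectable]  # MINUS
        return pass_group, fail_group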

The fail or pass outcomes can then become the input for further tests. If additional tests are connected only to the pass outcome, a product-acceptance test is created. If tests connect to both the pass and fail outcomes, a fault tree is created that isolates faults in failed products.

In either case, the object of each test is to reduce the size of the output ambiguity groups. When no test can be selected to further reduce these ambiguity-group sizes, the test design is complete.

Test Strategy

The strategy used to select a test and the test sequences targets identification of the smallest ambiguity group, measured by the number of parts, using the least number of tests. The least number of tests actually is the smallest mean number of tests needed to reach the terminal nodes of the diagnostic fault tree. The fault tree is made by interconnecting the tests illustrated in Figure 1 into a binary tree.
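Building on the earlier sketch, a minimal node structure for such a binary fault tree might look like the following; the field names are our own, chosen for illustration.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class TestNode:
        """One test in the diagnostic fault tree (a binary tree)."""
        name: str
        pass_group: AmbiguityGroup            # modes still ambiguous after a pass
        fail_group: AmbiguityGroup            # modes implicated by a fail
        on_pass: Optional["TestNode"] = None  # next test if the UUT passes
        on_fail: Optional["TestNode"] = None  # next test if the UUT fails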

Selecting the Best Test

A terminal conclusion is defined as the pass or fail conclusion for which no more tests can be found. If the no-fault mode is present, it is the product-acceptance test result, with all remaining faults being undetectable. Otherwise, the parts with the resulting failure modes could be replaced to repair the UUT.

In general, our goal is to reach the terminal pass/fail conclusions by performing the fewest tests to reach each conclusion. Several heuristic approaches are possible, one of which follows.

If we take the idea of failure modes one step further, we can give each mode a weight that is proportional to the failure rate. To avoid looking up failure rates, we can default these weights to 1.0 and some time later fill in a more precise number.

For each test candidate, we can compute the probability of a pass outcome and a fail outcome. From a local point of view, the sum of the pass and fail probabilities must be unity; that is, the UUT either passes or fails a particular test. We can use these probability values and entropy calculations to find the best tests. Entropy, as defined in information theory, is a measure of the amount of information in a message, based on the logarithm of the number of possible equivalent messages.

Test entropy is expressed as:

Entropy = -q*log(q) - p*log(p)

where p and q are the pass and fail probabilities:

p = Σ(pass weights)/(Σ(pass weights) + Σ(fail weights))

q = Σ(fail weights)/(Σ(pass weights) + Σ(fail weights))

The highest entropy test contains the most information. We select the best test as the test having the highest entropy. For the case when failure weights are defaulted to unity, this method will tend to divide the number of input failures into two equal groups. Since no-fault can only be in the pass group, a high no-fault weight will steer the tests through the pass leg fastest, making the best product acceptance test.
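Continuing the sketch, the entropy grade and the best-test selection might be coded as follows. Representing each candidate test simply by its test ambiguity group is an assumption made here for illustration.

    import math

    def test_entropy(pass_group: AmbiguityGroup,
                     fail_group: AmbiguityGroup) -> float:
        """Grade a test by the entropy of its pass/fail weight split."""
        wp, wf = group_weight(pass_group), group_weight(fail_group)
        total = wp + wf
        entropy = 0.0
        for w in (wp, wf):
            p = w / total if total else 0.0
            if p > 0.0:
                entropy -= p * math.log(p)
        return entropy

    def select_best_test(input_group: AmbiguityGroup,
                         candidates: dict[str, AmbiguityGroup]) -> str:
        """Return the name of the candidate test with the highest entropy."""
        def grade(name: str) -> float:
            pass_g, fail_g = split_outcomes(input_group, candidates[name])
            return test_entropy(pass_g, fail_g)
        return max(candidates, key=grade)

With default unity weights, this selection tends toward the even split of failures described above; raising the no-fault weight biases it toward tests whose pass leg resolves quickly.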

On the other hand, if we want to test a faulty product, we would give the no-fault probability a lower value. Then the test tree would be different, having a tendency to isolate faults with fewer tests.

Test Sequencing

The definition of the best test did not include the difficulty of setting up or performing the test, nor did it include the possibility that a failed part could destroy other parts. In addition, the tests were selected independently of the product specification, so a UUT with a tolerance failure could reach the terminal pass outcome.

To overcome these problems, tests are sequenced. The sequence priority is as follows (a code sketch appears after the list):

1. Perform tests that eliminate failure modes that could be destructive in future tests.

2. Perform easy tests first, such as DC measurements at room temperature.

3. Next, perform tests that use the same environment, such as rise-time measurements.

4. Repeat the previous steps for major setup changes, such as high temperature.
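A rough sketch of this ordering follows, assuming each test carries a couple of illustrative attributes; the field names below are ours, not from any tool.

    def sequence_tests(tests: list[dict]) -> list[dict]:
        """Order tests by the priority rules above; sorted() is stable,
        so ties keep their original (entropy-selected) order."""
        def priority(test: dict):
            return (
                0 if test.get("eliminates_destructive_mode") else 1,  # rule 1
                test.get("setup_cost", 0),  # rules 2-4: cheaper setups first
            )
        return sorted(tests, key=priority)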

Robust Tests

Tolerances can cause measurement results to migrate across the test-limit boundary. As a result, a faulty part could be classified as good, or a good part could be classified as a failure. The tolerances include:

UUT part tolerances.

Computer model accuracy.

Measurement tolerances.

UUT noise.

Test-set noise.

Fault-prediction accuracy.

In Part 1, we showed that avoiding false test conclusions for tolerance failures requires setting the test limits as wide as possible. Now we will show how to set the limits as far away from the expected failed results as possible. We will do this by a unique test-selection procedure. But first, to compare one test with another, we need to define a measure of test robustness.

The range of values of a test measurement that defines acceptable performance is called the tolerance band. The measure of test robustness with respect to a failure mode is the distance between the failed measurement result and the nearest test limit, divided by the tolerance band. We call this value the guard band.

The test limit can be safely placed in the guard band so long as no other faults fall in this band. Normalizing all measurements using their tolerance band allows us to compare guard bands of different tests. Then, we can modify the entropy selection method to reject tests with small guard bands. Figure 2 shows how this works.
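As a sketch, the guard band for one predicted failed result could be computed with a hypothetical helper like the one below, assuming the limits and the failed result refer to the same measurement.

    def guard_band(failed_result: float,
                   low_limit: float, high_limit: float) -> float:
        """Distance from a predicted failed measurement to the nearest test
        limit, normalized by the tolerance band so tests can be compared."""
        band = high_limit - low_limit
        nearest = min(abs(failed_result - low_limit),
                      abs(failed_result - high_limit))
        return nearest / band

A test's overall guard band would then be the minimum over the failure modes in its fail group, and tests whose minimum falls below a chosen threshold would be rejected during selection.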

In the Figure 2 example, we have two operating-point tests. Failure modes are identified as No-Fault, F1, F2, ..., F6. Assume test A is performed first and test B is performed on the pass group of test A, as shown in Figure 3. Test A divides the failures into a pass group containing F4, F5, and F6 and a fail group containing F1, F2, and F3.

Connecting test B to the test-A pass outcome eliminates F1 from the test-B failure input, and the guard band for test B extends from the high limit to F4. If test B were performed first, the guard band would be smaller, from the test B high limit to F1.

An incorrect test outcome will invalidate subsequent test decisions. To be right most often, tests with large guard bands should be performed first because they are less likely to be wrong. Moreover, tests that previously were rejected may be excellent tests later in the sequence. Tests with small guard bands simply should not be used.

While a model of the statistical distribution was shown in Figure 2, there usually is not enough information for complete knowledge of the statistics of a measurement result. In particular, the mean frequently is offset because of model errors; for example, a circuit operating from different power-supply voltages than were expected.

The statistics of failed measurements are even less certain because the failure mode and the failed circuit states are less accurately predicted. As a result, it is necessary to widen the tolerance band as much as possible. We have avoided saying exactly where in the guard band the measurement limit should be placed. It is a judgment call, depending on how much the tolerance band was widened and on the quality of the fault prediction.

Final Remarks

EDA software has put a new face on test design. Emerging EDA technology now provides a solution that is not only better, but also faster and more cost-effective. Benefits in production quality, consumer safety, and faster design times make a compelling argument for considering this approach.

About the Author

Lawrence Meares founded Intusoft in 1985 and currently leads the software development teams responsible for the Magnetics Designer and Test Designer products. He has a software patent pending for his work on the ICAP/4 schematic capture program. Previously, Mr. Meares worked at McDonnell Douglas as a manager for a circuit design team and as the principal investigator for custom IC research. He received a B.S.E.E. degree at the University of Santa Clara and completed graduate studies in electrical engineering at Seattle University and UCLA. Intusoft, P.O. Box 710, San Pedro, CA 90731, (310) 833-0710.

Copyright 1998 Nelson Publishing Inc.

June 1998
