Electronic Design

# What's All This Statistical Stuff, Anyhow?

I've always been a fan of Mark Twain and his writing.  He had a rather good perception of the American people, and many topics that he wrote about are fascinating to this day.  One of my favorite quotes of Twain is:  "There are three kinds of lies:  there are lies, there are damned lies, and there are STATISTICS...."

One thing that doesn't help me a darned bit is "statistics," at least in the sense that most mathematicians and engineers use them.  I find most statistical analyses worse than useless.  But I do like to use charts and graphs.  I took some data of diodes' VF versus IF recently.  The data was a little suspicious when I wrote down the numbers, but after I plotted the data, I knew there was something wrong.  Then I just went back and took more data until I understood what the error was:  AC current noise that was being pumped out of the inputs of the digital voltmeter, crashing into the diode, and causing rectification.  If data arises from a well-behaved phenomenon and conforms to a nice Gaussian distribution, then I don't care if people use their statistical analyses; it may not do a lot of harm.  (Personally, I think it does harm, because when you use the computer and rely on it like a crutch, you get used to believing it and trusting it without thinking....)  However, when the data gets screwy, classical statistical analysis is worse than useless.

For example, one time a test engineer came to me with a big formal report.  Of course, it didn't help that it arrived at 1:04 P.M. for a Production Release Meeting that was supposed to start at 1:00 P.M.  But this was not just any hand-scrawled report.  It was handsome, neat, and computerized; it looked professional and compelling.  The test engineer quoted many statistical items to show that his test system and statistical software were great (even if the ICs weren't).  Finally he turned to the last page and explained that, according to the statistics, the ICs' outputs were completely incompetent and way out of spec.  Thus, the part could not be released.

In fact, he observed, the median level of the output was 9 V, which was pretty absurd for the logical output of an LM1525-type switching regulator, which could only go to the Low level of 0.2 V or the High level of 18.4 V.  How could the outputs have a median level of 9 V?

How do you get an R-S flip-flop to hang up at an output level half-way between the rails?  Unlikely....  Then he pointed out some other statistics:  the 3-sigma values of the output were +30 V and -8 V.  Now, that's pretty bizarre for a circuit that only has a +20-V supply and ground (and it isn't running as a switching regulator; it's just sitting there at dc).  The meeting broke up before I could find the facts and protest, so that product wasn't released on schedule.

It turns out, of course, that the tester was misbehaving.  So while the outputs were all supposed to be set to +18.4 V, they were actually in a random state.  Half of the time the outputs might be at 18.4 V and half of the time at 0.2 V.  If you feed this data into a statistical program, it might indeed tell you that a lot of the outputs would be at +9 V, and some of the outputs might be at -8 V, assuming that the data came from a Gaussian distribution.  But if you look at the data and think, it's obvious that the data came from a ridiculous situation.  Rather than ramming the data into a statistical format, the engineer should have checked his tester.
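It's easy to reproduce the absurdity.  Here's a toy sketch (the readings are made up, not the actual tester log) of what happens when a Gaussian-assuming program chews on bimodal data like that:

```python
import statistics

# Hypothetical readings: the outputs were *supposed* to sit at +18.4 V,
# but the misbehaving tester left them in a random state, so roughly
# half came back High (18.4 V) and half Low (0.2 V).
readings = [18.4, 0.2] * 50   # 100 readings, perfectly bimodal

mean = statistics.mean(readings)     # ~9.3 V, a level no R-S flip-flop
sigma = statistics.pstdev(readings)  # output can actually sit at

# A program that assumes a Gaussian distribution happily reports:
hi = mean + 3 * sigma
lo = mean - 3 * sigma
print(f"mean = {mean:.1f} V, 3-sigma band = {lo:+.1f} V to {hi:+.1f} V")
# The 3-sigma band runs far above the +20-V supply and well below
# ground -- physically impossible, which is the tip-off that the data
# never came from a Gaussian distribution in the first place.
```

The exact numbers depend on the mix of High and Low readings, but the moral doesn't:  a mean near half-scale and sigma limits outside the supply rails mean the distribution assumption is wrong, not the ICs.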

Unfortunately, this engineer had so much confidence in his statistical program that he spent a whole week preparing the Beautiful Report.  Did he inform the design engineer that there were some problems?  No.  Did he check his data, check the tester?  No.  He just kept his computer cranking along, because he knew the computer analysis was the most important thing.

We finally fixed the tester and got the product out a little late, but obviously I wasn't a fan of that test engineer (nor his statistics) as long as he was at our company.  And that's just one of a number of examples I trot out when anybody tries to use statistics that are inappropriate.

I do like to use scatter plots in two dimensions to help me look for trends, and to look for "sports" that run against the trend.  I don't look at lots of data on good parts or good runs, but I study the heck out of bad parts and bad runs.  And when I work with other test engineers who have computer programs that facilitate these plots, I support and encourage those guys to use those programs, and to look at their data, and to think about those data.  I support anything that facilitates thinking.
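For what it's worth, the kind of look-at-your-data pass a scatter plot supports can be sketched in a few lines.  This is only an illustration with invented numbers, not any particular test program:  fit a straight-line trend, then flag "sports" that sit far off it.  Note the use of a median-based spread, so one wild point can't inflate the very yardstick you're measuring it against.

```python
import statistics

def flag_sports(xs, ys, k=3.0):
    """Fit a straight-line trend to (xs, ys), then flag 'sports':
    points whose residual is far off the trend.  Uses a median-based
    spread (MAD) so one wild point can't corrupt the yardstick."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Ordinary least-squares slope and intercept for the trend line.
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    resid = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    med = statistics.median(resid)
    mad = statistics.median(abs(r - med) for r in resid)
    # 1.4826 * MAD estimates sigma for Gaussian data; flag anything
    # more than k of those units off the trend.
    return [i for i, r in enumerate(resid) if abs(r - med) > k * 1.4826 * mad]

xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
ys[7] += 25.0                  # one bad part, way off the trend
print(flag_sports(xs, ys))     # -> [7]
```

The robust spread matters here:  a plain 3-sigma test on the same data would let the bad point inflate sigma enough to hide itself.  That's the whole point of plotting and looking rather than cranking the classical formulas.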

A couple years ago, I was approached by an engineer who was trying to use one of our good voltage references with a typical characteristic of about 20 ppm per 1000 hours long-term stability at +125°C.  He was using it around room temperature, and was furious because he expected it to drift about 0.1 ppm per 1000 hours at room temp, and it was a lot worse than that.  He asked why our reference was no good.  I pointed out that amplifiers' drifts and references' drifts do not keep improving by a factor of 2 every time you cool them off another 11 degrees.
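The column doesn't show the arithmetic, but the flawed extrapolation is easy to reconstruct if you assume (as that engineer apparently did) that drift halves for every 11°C of cooling, all the way down from +125°C.  The numbers below are just that assumed rule of thumb, not a real drift model:

```python
# Hypothetical reconstruction of the engineer's (flawed) extrapolation:
# start from the 20 ppm/1000 hr typical drift at +125 C and assume it
# halves for every 11 degC of cooling, down to room temperature.
spec_drift_ppm = 20.0     # typical long-term stability at +125 C
t_hot, t_room = 125.0, 25.0
halving_step = 11.0       # degC per assumed factor-of-2 improvement

naive = spec_drift_ppm * 0.5 ** ((t_hot - t_room) / halving_step)
print(f"naive extrapolation: {naive:.3f} ppm/1000 hr at room temp")
# -> a few hundredths of a ppm, the sub-0.1-ppm figure the engineer
#    was counting on -- and a number the real part will never deliver,
#    because drift mechanisms don't keep halving as you cool.
```

Compounding a factor of 2 over nine-odd steps turns 20 ppm into a fantasy figure several hundred times better than the spec, which is exactly why blind extrapolation of that sort gets people in trouble.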

I'm not sure who led him to believe that, but in general, modern electronic components aren't greatly improved by cooling or the absence of heating.  In fact, those of us who remember the old vacuum-tube days remember that a good scope or voltmeter usually worked better if you kept it running nice and warm all the time, because all of the resistors and components stayed dry and never got moist under humid conditions.  I won't say that the electrolytic capacitors might not have liked being a little cooler.  But the mindless effort to improve the reliability by keeping components as cool as possible has been overdone.  I'm sure you can blame much of that foolishness on MIL-HDBK-217 and all its versions.  In some businesses, you have to conform to -217, no matter how silly it is, but in the industrial and instrument business, we don't really have to follow its every silly quirk and whim.

One guy who argues strenuously about -217 is Charles Leonard of Boeing, and you may well enjoy his writing (Leonard, Charles, "Is reliability prediction methodology for the birds?," Power Conversion and Intelligent Motion, November 1988, p. 4).  So if something is drifting a little and you think you can make a big improvement by adding a fan and knocking its temperature down from +75 to +55°C, I caution you that you'll probably be disappointed because there usually isn't a lot of improvement to be had.  It's conceivable that if you have a bad thermal pattern causing lots of gradients and convection, you can cut down that kind of thermal problem.  In general, though, there's not much to be gained unless parts are getting up near their maximum rated temperature or above +100°C.  Even plastic parts can be pretty reliable at +100°C.  I know the ones I'm familiar with are.

(This column is an excerpt from the soon-to-be-published book I have written entitled "Troubleshooting Analog Circuits."  This endeavor will be published by Butterworths in April 1991.)

All for now./Comments invited!/RAP/Robert A. Pease/Engineer

ADDRESS:  Mail Stop C2500A, National Semiconductor, P.O. Box 58090, Santa Clara, CA 95052-8090