Know Your Benchmarks Before You Make Comparisons

A Polk survey has been touted as showing that “only” 35% of hybrid car owners would buy another. Unfortunately, the details of the survey are not available. Polk is in the business of selling information, so only tidbits get tossed around.

My wife and I own hybrids. My wife likes her Honda Civic hybrid, while I have a Toyota Prius (see the figure). She likes the smaller size, and I like the ability to run all-electric. It’s handy when I’m sitting in traffic or driving around the neighborhood. It’s one of the smoothest cars I have ever driven.

The 2012 Toyota Prius looks slick but I’ll probably have my Winter Gray 2010 model for a few more years yet. At that point, another hybrid or electric will be the replacement.

The problem with the survey results is that it’s difficult to determine what they mean because we don’t know what questions were asked and what alternatives were provided.

Lies, Damn Lies and Statistics

Cars are always a hot topic with designers, programmers, and engineers. So are other measurement issues like benchmarks, which are a lot like surveys. The idea is to present useful numbers that allow people to make comparisons that will affect future decisions.

Many years ago, when PC Magazine was still a print publication, I was the director of PC Labs, which was known for its benchmarks. That was decades ago, when comparing processors, printers, and graphics cards was relatively easy. We were usually comparing related processors, and processor benchmark results were often a function of clock rate.

Benchmark results and comparisons are useful when it’s possible to extrapolate from these values. It helps to know what was tested so the extrapolation doesn’t go off to places it shouldn’t. This is the problem with statistics in general. One wrong assumption and all bets are off.

Still, benchmarks and surveys can provide useful information. The more one knows about what is being measured and why, the better.

One challenge at PC Labs was reducing the number of results while retaining useful information. Everyone wanted a single number as a result to make their job of evaluating the items being tested easier. But single numbers tend to make results misleading when a lot of variables come into play.
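To see how a single number can hide what matters, here is a minimal sketch in Python. The two systems, their sub-scores, and the choice of a geometric mean as the composite are all assumptions made up for illustration. They are not PC Labs data or its actual scoring method.

```python
from statistics import geometric_mean

# Hypothetical sub-scores (higher is better), invented for illustration only.
scores = {
    "System A": {"cpu": 100, "disk": 100, "graphics": 100},
    "System B": {"cpu": 200, "disk": 50, "graphics": 100},
}

def composite(subscores):
    """Collapse a dict of sub-scores into one number with a geometric mean."""
    return geometric_mean(list(subscores.values()))

for name, subs in scores.items():
    # The composite looks the same for both systems; the detail does not.
    print(f"{name}: composite={composite(subs):.1f} detail={subs}")
```

Both systems end up with a composite of about 100, yet System B is twice as fast on the processor test and half as fast on the disk test. Which one is "better" depends entirely on the workload, and that is exactly the information the single number throws away.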

Embedded Benchmarks

Computer benchmarks have been around for ages, and they continue to be useful when they’re used properly. One source for the embedded market is EEMBC, originally known as the Embedded Microprocessor Benchmark Consortium.

EEMBC benchmarks are highly regarded and numerous, allowing people to examine key aspects of a system design. One of the most commonly highlighted benchmarks, the CoreMark, actually generates a single number. It’s designed for single-core processors and written in C, enabling it to be relatively portable.

CoreMark does a good job of generating comparable results when comparing two microcontrollers with the same architecture using the same compiler. Change compilers or architectures, and the comparisons become more tenuous. Of course, this doesn’t stop marketing departments from touting better CoreMark values.

Unfortunately, even the simple CoreMark tests the architecture, the compiler, and a particular instance of the chip all at once. I would consider a difference of an order of magnitude important, but smaller differences could easily be due to other factors. Still, benchmark and survey results can be very useful when they are used with care and understanding, and so can the plethora of other benchmarks from EEMBC.
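As a rough sketch of that rule of thumb, the Python below normalizes two scores by clock rate, in the spirit of the commonly quoted CoreMark/MHz figure, and flags the gap as meaningful only if it reaches an order of magnitude. The part names, scores, clock rates, and the helper functions are hypothetical. The 10x threshold is just my rule of thumb from above, not an EEMBC guideline.

```python
def coremark_per_mhz(coremark, clock_mhz):
    """Normalize a raw score by clock rate so parts at different speeds line up."""
    return coremark / clock_mhz

def compare(score_a, score_b, meaningful_ratio=10.0):
    """Return (ratio, is_meaningful) for two positive scores.

    meaningful_ratio defaults to 10.0, an order of magnitude. Anything
    smaller may be due to the compiler, settings, or the individual chip.
    """
    high, low = max(score_a, score_b), min(score_a, score_b)
    ratio = high / low
    return ratio, ratio >= meaningful_ratio

# Hypothetical parts and numbers, invented for illustration only.
part_x = coremark_per_mhz(coremark=240.0, clock_mhz=80.0)
part_y = coremark_per_mhz(coremark=168.0, clock_mhz=48.0)

ratio, meaningful = compare(part_x, part_y)
print(f"X={part_x:.2f}/MHz Y={part_y:.2f}/MHz ratio={ratio:.2f} meaningful={meaningful}")
```

With these made-up numbers, the two parts differ by well under a factor of two per megahertz, so the comparison is flagged as inconclusive. A marketing slide would still call one of them the winner.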

If you want more complex benchmarks than CoreMark, EEMBC offers system-oriented tests such as AndEBench for Android and GrinderBench for Java. Then there are processor-oriented benchmarks like FPBench for floating point and DENBench for digital entertainment products. There is even a power usage benchmark, EnergyBench. Across all of them, EEMBC’s benchmarks provide consistency and lots of results.

The challenge is deciding which numbers are going to be worthwhile for you and knowing which will be relevant to your design. Unless you’re generating your own benchmarks, you will likely need to compromise on both the tests and the results.

My wife and I are empty nesters. We went through a Toyota and Honda van when the kids were at home, but we didn’t even consider buying another one when we were downsizing. An electric car may be in our future, but that will be a few years from now. We tend to drive our cars until they don’t run anymore, and the Civic and Prius have a long life ahead of them.

Benchmarks and surveys aren’t always about the answers. Sometimes they’re about asking the right questions.