JPL statistician comments on big data, cost and reliability of results

Feb. 10, 2015

Amy Braverman, principal statistician at the Jet Propulsion Laboratory, develops strategies to analyze information from NASA’s space-borne instruments. In an interview with Scott Thurm, senior deputy technology editor at The Wall Street Journal, she comments on dealing with imperfect data or data that’s not perfectly arranged in rows and columns.

“Data collection is so different today than it used to be,” she said, pointing out that spacecraft collect information on thousands of variables, freeways have built-in sensors, and supermarket scanners fill databases with information on purchases. Such “opportunistic” data doesn’t lend itself to analysis using traditional statistical and data-mining technologies, so your existing software is unlikely to work.

Data is also distributed, she said. Distributed computing is well known—you divide a problem and farm out the pieces to multiple computers. Distributed data presents different challenges—you may be trying to calculate correlation coefficients between a column of data in New York and one in Los Angeles. You can move the data back and forth, she said, but that might get expensive. Or you can operate on summaries of the data, but that could compromise the accuracy of your result. There is a tradeoff between cost and the reliability of your conclusion, she said.
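The tradeoff between moving data and working from summaries can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not something described in the interview: the data are synthetic, the "summary" is simply the mean of each block of 100 rows, and the function names are hypothetical. It compares the correlation computed after shipping every value with the correlation computed from the block means alone.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def block_means(values, block_size):
    """Summarize a column as the mean of each consecutive block of rows."""
    return [
        sum(values[i:i + block_size]) / len(values[i:i + block_size])
        for i in range(0, len(values), block_size)
    ]


def compare(n=10_000, block_size=100, seed=0):
    """Compare full-data and summary-based correlation on synthetic columns."""
    rng = random.Random(seed)
    # Synthetic stand-ins (an assumption): one column per site, with the
    # Los Angeles values depending linearly on the New York values plus noise.
    new_york = [rng.gauss(0.0, 1.0) for _ in range(n)]
    los_angeles = [0.5 * x + rng.gauss(0.0, 1.0) for x in new_york]

    # Option 1: ship the whole Los Angeles column to New York (n values moved).
    exact = pearson(new_york, los_angeles)

    # Option 2: ship only block means (n / block_size values moved).
    ny_summary = block_means(new_york, block_size)
    la_summary = block_means(los_angeles, block_size)
    approx = pearson(ny_summary, la_summary)

    return {
        "exact": exact,
        "values_moved_exact": n,
        "approx": approx,
        "values_moved_summary": len(la_summary),
    }


if __name__ == "__main__":
    print(compare())
```

In this sketch the summary route moves 100 values instead of 10,000, but its estimate rests on only 100 pairs and is therefore noisier. Real summaries and real data could behave quite differently.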

You also need to know what the data you’ve captured means. From her own work, she noted that polar-orbiting satellites monitoring for CO2 concentrations might cross the equator at 1:30 p.m.—that’s a time when plants are photosynthesizing, so it would be a mistake to base global CO2 distributions for all times of day on that data. “You have to be aware of the biases that may be imparted to the data that you have, relative to the data you wish you had,” she said.
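A toy calculation shows how sampling at a single time of day can bias an estimate. The diurnal curve below is entirely hypothetical: the 400 ppm baseline, the 5 ppm amplitude, and the cosine shape are made-up numbers chosen only to show the effect, not measurements or a model from Braverman's work.

```python
import math


def co2_ppm(hour):
    """Hypothetical diurnal CO2 cycle: highest at 02:00, lowest at 14:00."""
    return 400.0 + 5.0 * math.cos(2.0 * math.pi * (hour - 2.0) / 24.0)


def overpass_bias(overpass_hour=13.5):
    """Difference between the value seen at the overpass and the daily mean."""
    daily_mean = sum(co2_ppm(h) for h in range(24)) / 24.0
    at_overpass = co2_ppm(overpass_hour)
    return at_overpass - daily_mean


if __name__ == "__main__":
    # Negative result: the early-afternoon sample sits below the daily mean
    # under this assumed curve.
    print(round(overpass_bias(), 2))
```

Under these assumed numbers, a value taken only at 1:30 p.m. falls several ppm below the all-day average, which is the kind of gap between the data you have and the data you wish you had.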

It’s important, she said, to think hard from first principles and build new statistical tools.

WSJ subscribers can see an excerpt of the interview here.—Rick Nelson


About the Author

Rick Nelson | Contributing Editor

Rick is currently Contributing Technical Editor. He was Executive Editor for EE in 2011-2018. Previously he served on several publications, including EDN and Vision Systems Design, and has received awards for signed editorials from the American Society of Business Publication Editors. He began as a design engineer at General Electric and Litton Industries and earned a BSEE degree from Penn State.
