
Using AI for Real-Time Engineering Decisions

Feb. 3, 2022
MathWorks’ Heather Gorr explains how engineers can apply artificial intelligence to real-time engineering decisions, as well as the issues involving data synchronization.

What you'll learn:

  • How to apply AI models to real-time sensor datasets.
  • The all-important issue of data synchronization and the best methods to achieve that goal.
  • A look at streaming environment models.

AI is everywhere. AI models are now integrated into a diverse set of applications, including automated vehicles, manufacturing equipment, and medical devices. These applications often rely on dozens, hundreds, or even thousands of sensors to gather real-time performance data. With those sensors generating data continuously, producing insights at an equally constant pace is difficult unless engineers deploy machine-learning models built for the task.

Having ready access to such a wide pool of sensor data can feel overwhelming. However, engineers can use several techniques to apply AI models to real-time sensor datasets, helping them better prepare the data and extract insights from it, as Heather Gorr, Senior Product Manager of Data Science at MathWorks, discusses with Electronic Design’s Bill Wong.

Why is it a challenge to apply machine learning to real-time sensor data?

The greatest challenge is data synchronization. It can be difficult even when not dealing with high-frequency, time-sensitive data. For real-time sensor data, each sensor has a slightly different sampling rate, or time step, that must be synchronized into a single streaming dataset with identical times for analysis. It’s hard to know where to start.

How can engineers ensure a real-time sensor dataset is synchronized?

At its heart, data synchronization is about choosing how best to align data points wherever the time steps don’t match, whether through aggregation, interpolation, or simply filling gaps with averages. This ensures that time steps are in sync while matching the original dataset closely enough to remain usable. The choice of method depends on factors such as time vector alignment and application requirements.

The first step, particularly if the time alignment between datasets is uncertain, should be to combine the datasets while marking the gaps as missing, for example with an outer join or a constant fill value. Exploring and visualizing the resulting data (including the time steps and missing data points) will help engineers determine how to proceed.
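As a sketch of that first step in Python with pandas, an outer join keeps every time step from both sensors and marks the mismatched points as missing; the sensor names, timestamps, and values here are purely illustrative:

```python
import pandas as pd

# Two hypothetical sensors sampled at different rates (names are illustrative).
temp = pd.DataFrame(
    {"time": pd.to_datetime(["2022-01-01 00:00:00",
                             "2022-01-01 00:00:02",
                             "2022-01-01 00:00:04"]),
     "temp_c": [20.0, 20.4, 20.8]})
vib = pd.DataFrame(
    {"time": pd.to_datetime(["2022-01-01 00:00:00",
                             "2022-01-01 00:00:03"]),
     "vib_g": [0.10, 0.12]})

# An outer join keeps every time step from both sensors; gaps become NaN,
# which makes the mismatched points easy to see before choosing a fill method.
merged = (temp.merge(vib, on="time", how="outer")
              .sort_values("time")
              .reset_index(drop=True))
print(merged)
```

The NaN entries show exactly where the two time vectors disagree, which is the information needed to choose a fill or interpolation strategy.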

What is the best method for synchronizing sensor data?

To reiterate, data synchronization is mostly about deciding how to fill in mismatched data points, and the most common method for doing so is interpolation of the sensor data. Because the gaps in sensor datasets are typically small, engineers often already have insight into the trends driving the data. Linear interpolation is especially common, as it’s easy to understand.

However, interpolation becomes less accurate as the points get farther apart, in which case a polynomial or spline interpolant is the better choice. To retain more of the trend, it’s common to use a shape-preserving piecewise cubic (“pchip”) or Akima piecewise cubic Hermite interpolant. Keep in mind that for these interpolation methods to work, the time vector must be monotonically increasing (i.e., sorted, with no duplicate time steps).
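These options can be compared in a few lines of Python using NumPy and SciPy; the time vector and signal values below are invented for illustration. Note that all three interpolants pass exactly through the known samples and differ only in how they behave between them:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator, Akima1DInterpolator

# A sorted (monotonically increasing) time vector is required by these methods.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # seconds (illustrative)
y = np.array([0.0, 0.1, 0.2, 2.0, 2.1])   # sensor reading with a sharp rise

t_new = np.arange(0.0, 4.01, 0.5)          # finer, common time base

# Linear interpolation: simple and fast, but cuts corners on curved trends.
y_lin = np.interp(t_new, t, y)

# Shape-preserving piecewise cubics: follow the trend without overshoot.
y_pchip = PchipInterpolator(t, y)(t_new)
y_akima = Akima1DInterpolator(t, y)(t_new)

# Every method reproduces the original samples exactly at the known times.
print(y_pchip[::2])
```

The pchip interpolant is often preferred near sharp transitions like the one above, since unlike an unconstrained cubic spline it never overshoots the data.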

These methods are commonly built into the APIs and modules of mainstream data science platforms because of their popularity and effectiveness.

What about streaming environment models, where predictions must be made and reported continuously?

The first step is to collaborate with the engineering team on planning the system. Before building anything, it’s important to establish the system requirements and parameters, such as the time window, which controls how much data enters the system for each prediction. The second step is to build a full streaming prototype as early as possible; the algorithms themselves can be fine-tuned later.

Multiple resources exist for comparing algorithms, but the model chosen for streaming data should be one that’s well-suited for time series and forecasting. Potential models include traditional time-series (GARCH, ARIMA, curve fitting), machine learning (SVMs, nonlinear trees, Gaussian processes), and deep learning (LSTMs, CNNs, TCNs, multilayer perceptron).

While all of these models can work, several key aspects should be considered when working with streaming data. Typically, a streaming dataset is analyzed in windows of a second or less, so any algorithm must be able to run within that window. The chosen algorithm should also be capable of updating in real-time, incorporating new data without discarding historical context. Any predictions the model generates should be equally fast and easy to distribute.
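The windowed, incrementally updated processing described above can be sketched in plain Python with a fixed-length buffer standing in for the time window; the window size, sample rate, and the running-mean "model" are illustrative placeholders for whatever streaming-capable model is actually chosen:

```python
from collections import deque
from statistics import mean

# Illustrative window: keep the most recent 10 samples (~1 s at 10 Hz).
WINDOW = 10

buffer = deque(maxlen=WINDOW)  # oldest samples drop off automatically

def update_and_predict(sample):
    """Incorporate a new reading and return a prediction for the window."""
    buffer.append(sample)
    return mean(buffer)  # stand-in for a fast per-window model prediction

stream = [0.1 * i for i in range(25)]  # simulated sensor stream
predictions = [update_and_predict(s) for s in stream]
print(round(predictions[-1], 2))
```

The `deque(maxlen=...)` keeps memory bounded no matter how long the stream runs, which is exactly the property a real-time window needs.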

How do you prepare streaming data for a machine-learning model in particular?

As mentioned, analyzing streaming data requires planning. It’s important to capture data types, time-window requirements, and other expectations throughout the development process, and standard software practices like documentation, source control, and unit testing also help facilitate streaming data preparation. Failure data is needed as well; when real failures are scarce, it can be simulated so the model learns to predict them.

Since data passes through the stream only one second at a time, it’s also important that the model analyze as much information with as little noise as possible. It’s common to use frequency-domain tools such as the FFT and power spectrum. Caching the model is another helpful method of maintaining the low latency needed in these systems.
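As an illustration of that frequency-domain step, the NumPy sketch below computes the power spectrum of a one-second window and extracts the dominant frequency as a compact, low-noise feature; the sampling rate and test signal are assumptions made for the example:

```python
import numpy as np

fs = 100.0                       # assumed sampling rate, Hz
t = np.arange(0, 1.0, 1 / fs)    # one-second window of streaming data
# Simulated sensor signal: a 5-Hz component plus a weaker 20-Hz component.
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)

# Power spectrum of the window via the FFT (real-input variant).
X = np.fft.rfft(x)
power = (np.abs(X) ** 2) / len(x)
freqs = np.fft.rfftfreq(len(x), d=1 / fs)

# The dominant frequency is a typical feature to feed the model.
print(freqs[np.argmax(power)])   # → 5.0
```

A handful of spectral features like this one summarizes the whole window, so the model sees far less noise than it would from the raw samples.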

Challenging as it can be to apply AI models to streaming applications, tools like MATLAB and Apache Kafka can be used to help integrate the data preparation and AI modeling stages into the streaming architecture, making it easier to execute.

Heather Gorr is the Senior Product Manager, Data Science, for MathWorks' MATLAB platform, where she focuses on data analytics, preprocessing, structures, mathematics, and big data.

