Remember when you had to push a button or turn a handle to make something happen? It seems like a long time ago, but control via touch, swipe, wave, and voice is really a recent development. Still, the trends are clear.
When I walk up to my Toyota Prius, I unlock the door by pulling the handle. The car recognizes me via the key fob in my pocket. The same technology lets me start the car by pressing a button. Our other car requires a key. I only drive it occasionally, but I usually forget to pull out the key. I do look a little silly trying to start the car by pressing a non-existent button.
As you might guess, the Prius is the newer vehicle. Its voice-activated navigation system requires a button press on the steering wheel, but from then on, it’s all voice interaction for everything from making a call to changing a navigation point. My version relies on fixed keywords and segmented, data-oriented prompts, so it is not a matter of speaking naturally.
It gets the job done, though, and in some instances it’s the only way to do things. For example, some navigation features are only available via voice commands when the car is in gear. This requirement was included for safety’s sake.
These features have been standard on higher-end models for years. But each year, there have been significant improvements in functionality, performance, and reliability. The same is true for mobile devices and appliances, from tablets to washing machines.
A Touch On Glass
Mechanical buttons used to be cheap. They still are for many applications, but touch sensors are more the norm now where microcontrollers and microprocessors are involved.
Capacitive and resistive touch interfaces are built into many microprocessors. Creating the custom sensor layouts and overlays for multiple keys, sliders, or controls is as easy as making them for a single one, providing significant assembly, reliability, and cost advantages.
Gesture recognition is common on smart phones and tablets courtesy of capacitive touch support. It is possible with any touch sensing system, and some resistive systems handle multitouch (see “Resistive Advances Heat Up Touchscreen Wars” at electronicdesign.com). Gesture recognition can even be built into a chip (see “3D Gesture IC Takes Advantage Of E-Field Sensing” at electronicdesign.com).
A few gestures tend to be commonly supported, such as point and press, as well as dials and sliders that usually have a displayed representation on a dynamic screen or a static layout. The Apple iPhone popularized the pinch, zoom, and swipe gestures. More advanced gestures, however, tend to be application- or device-specific, making them look more like magic than replacements for physical device control. Users can become confused when they do not recognize the graphical representation that hints at the type of interaction it supports.
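As an illustration of how a pinch or zoom might be told apart, the sketch below compares the spacing between two touch points at the start and end of a gesture. The function name and the 20% threshold are assumptions made for this example, not values taken from any particular touch controller.

```python
import math


def classify_two_finger_gesture(start, end, threshold=0.2):
    """Classify a two-finger gesture from its start and end touch points.

    start and end are each a pair of (x, y) tuples, one per finger.
    threshold is the fractional change in finger spacing required
    before the motion counts as a pinch or zoom (an assumed value).
    """
    d0 = math.dist(start[0], start[1])
    d1 = math.dist(end[0], end[1])
    if d0 == 0:
        return "none"  # both fingers reported at the same point
    change = (d1 - d0) / d0
    if change > threshold:
        return "zoom"   # fingers spread apart
    if change < -threshold:
        return "pinch"  # fingers moved together
    return "none"


# Fingers start 10 units apart and end 20 units apart.
print(classify_two_finger_gesture(((0, 0), (10, 0)), ((0, 0), (20, 0))))  # zoom
```

A production recognizer would also track the gesture over time and filter sensor noise. This sketch only looks at the two endpoints.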
Capacitive touch technology can also support 3D sensing. The Z-axis typically has less accuracy but is more than sufficient for 3D gestures. 3D gesture sensing can be used to provide additional feedback such as highlighting a button before it is pressed in the same fashion as when a cursor hovers over a button or menu item and a help bubble appears.
Some capacitive-sensor button implementations have an air gap that provides mechanical feedback. The travel is not necessarily as great as that of a mechanical switch, but the construction is similar to conventional touch sensors with all of the controls laid out on a single layer. This still leaves a wide variety of touch sensors that could use a haptic feedback mechanism.
Feedback mechanisms can be divided into motorized, piezoelectric, and polymer actuators. Rotating and linear motorized systems have been very common, but some of the latest controller chips can provide sophisticated feedback mechanisms that are easy to coordinate. Piezoelectric systems can be very compact, allowing them to be employed in places where it would be difficult to place a motorized actuator.
Polymer-based systems are even more compact, so they can be used for localized feedback. Strategic Polymer’s Awake keyboard prototype (Fig. 1) implements feedback for each key using electromechanical polymer actuators (EMPs). The technology allows extremely thin systems.
The Power Of The Pen
Pen interfaces complement touch interfaces (see “The Year Of The Digital Pen” at electronicdesign.com). They can be implemented using the same technology used for finger touch recognition, although controller chips tend to specialize in stylus and multi-touch support.
Pens are more precise than fingers. This is useful for many applications including drawing. Pens can take advantage of the high accuracy of their sensing systems. They also can have buttons that improve their functionality once the user understands what the buttons can be used for (watch “N-trig IC Series Tackles Noise For Better Touch Response” at engineeringtv.com).
Keyboards, both physical and now virtual, unfortunately have reduced cursive penmanship to a dying art. The new pen interfaces probably won’t change this trend, but the interface is likely to remain useful. At this point the challenge is more on the application side than the hardware side, though there is less demand for more stylish pens than there once was.
Gaming The Motion System
More mobile devices like smart phones and tablets are incorporating microelectromechanical systems (MEMS) like 3D accelerometers and gyroscopes. These components let a device determine its orientation, so it can do tricks like automatically switching from portrait to landscape mode. Smart phones and tablets also can be used as game controllers and remote control devices.
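The portrait-to-landscape trick can be sketched from accelerometer readings alone, since gravity shows up mostly on whichever axis points down. The axis convention (x across the screen, y along it, z out of the screen) and the 0.8 flatness threshold below are assumptions for this example. Real platforms define their own conventions and add filtering and hysteresis.

```python
def screen_orientation(ax, ay, az, flat_threshold=0.8):
    """Pick a screen orientation from one accelerometer sample.

    ax, ay, az are accelerations in g along the assumed device axes:
    x across the screen, y along the screen, z out of the screen.
    flat_threshold is the fraction of gravity on z above which the
    device is treated as lying flat (an assumed value).
    """
    magnitude = (ax * ax + ay * ay + az * az) ** 0.5
    if magnitude == 0:
        return "unknown"  # no usable gravity reading
    if abs(az) / magnitude > flat_threshold:
        return "flat"     # on a table: keep whatever mode was active
    # Gravity lies mostly along the axis that points toward the ground.
    return "portrait" if abs(ay) >= abs(ax) else "landscape"


print(screen_orientation(0.0, 1.0, 0.0))  # portrait
print(screen_orientation(0.9, 0.1, 0.2))  # landscape
```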
Android phones and iPhones can be used to fly Parrot’s AR.Drone electric quadrotor UAV (see “Smart Phone Controls Low-Cost Quadrotor” at electronicdesign.com). The interface takes advantage of these sensors as well as the touchscreen, which also displays the output from the UAV’s on-board cameras. This provides a better control mechanism than a touch interface alone.
Sensor fusion crops up with multiple sensors. It enables the creation of virtual sensors. For example, a 3D virtual position sensor could be based on inputs from a GPS, an inertial navigation system (INS), plus 3D accelerometers and gyroscopes. The virtual sensor would use the information from all of these sources, but sometimes some may not be available.
For instance, GPS will not work in certain areas where radio reception is poor. INS systems tend to be power-hungry, whereas accelerometers tend to use very little power. A low-power device may not provide high accuracy, but it may be sufficient in many instances. It may also be the only one that is available in a particular location or time frame.
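One simple way to picture such a virtual position sensor is a weighted average that uses whichever sources are currently reporting. The sketch below weights each reading by the inverse of its variance. This is a common textbook approach chosen for illustration, and the function name, data layout, and numbers are hypothetical.

```python
def fuse_position(readings):
    """Fuse position estimates from several sources into one.

    readings is a list with one entry per source. Each entry is either
    None (source unavailable) or a ((x, y, z), variance) pair, where
    variance > 0 expresses how noisy that source is assumed to be.
    Returns the fused (x, y, z) tuple, or None if no source reported.
    """
    available = [r for r in readings if r is not None]
    if not available:
        return None
    # Less noisy sources (smaller variance) get larger weights.
    weights = [1.0 / variance for _, variance in available]
    total = sum(weights)
    return tuple(
        sum(w * pos[i] for w, (pos, _) in zip(weights, available)) / total
        for i in range(3)
    )


gps = ((10.0, 0.0, 0.0), 4.0)   # coarse fix
ins = ((12.0, 0.0, 0.0), 1.0)   # more precise estimate
print(fuse_position([gps, ins]))   # result is pulled toward the INS value
print(fuse_position([None, ins]))  # GPS unavailable: INS alone is used
```

Practical systems usually go further, with Kalman filters for example, but the fallback behavior when a source drops out is the same idea.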
Non-traditional control systems that utilize multiple sensors and sensor fusion abound these days. Google Glass is one example (Fig. 2). The display is the most notable part. Users simply look up to see a large screen that’s really a fraction of an inch. There are also a camera, 3D accelerometers, and audio feedback.
Voice commands and phone calls can be made using the device when it’s linked via Bluetooth to a smart phone. Audio feedback uses bone conduction technology rather than an earbud. The device has an on-board processor and 16 Gbytes of storage, but it is primarily used in conjunction with the Bluetooth-connected smart phone.
Sahas Katta’s Glass Tesla application runs on Google Glass (see “A View Of Google Glass” at electronicdesign.com). It is designed to work with a Tesla electric car, providing location and charging information as well as limited control of the vehicle.
One needs to experience Google Glass to understand how it changes the way one deals with a hands-free system. Speech recognition is important since it is used to initiate functions such as taking a picture or asking for directions. Imagine looking at a 3D map of your current location and turning your head to see what is nearby. Cutaway or structural views can be presented on screen while viewing the actual environment.
Google Glass provides a way to wear sensors and a display, which is a conventional 2D display for one eye. Oculus Rift is a 3D headset from Oculus VR (Fig. 3). Built-in 3D gyroscopes track head movement so the images presented to the displays in the headset can provide a virtual reality environment.
This Kickstarter project is moving from development platforms with VGA 3D resolution to high-definition 1080p resolution, which looks amazing. There are challenges, though. MEMS gyroscopes are relative devices, and they can have small amounts of drift that can be an issue with a virtual reality headset.
Perhaps this would be less of an issue if it were combined with absolute 3D positioning technology from Sixense (see “Sixense Sensor Provides Real 3D Positioning” at electronicdesign.com). The Sixense system uses a rotating magnetic field to track multiple sensors. It can deliver high-precision, absolute position information under 1 mm. A central controller generates the field, and sensors need to be within about 12 feet. Longer range is possible, but this tends to be sufficient for most applications.
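To see why drift matters and how an absolute reference helps, consider a stationary head and a gyroscope with a small constant bias. The sketch below integrates the gyro rate and, when an absolute heading is available, blends it in with a complementary filter. The filter, the 0.5°/s bias, and the 0.98 blend factor are assumptions for illustration and do not describe how Oculus or Sixense handle the problem. Angle wraparound is ignored.

```python
def track_yaw(gyro_rates, absolute_yaws, dt, alpha=0.98):
    """Track head yaw in degrees from gyro rates plus optional absolute fixes.

    gyro_rates: yaw rates in degrees per second, one per time step.
    absolute_yaws: absolute yaw in degrees per step, or None when no
        absolute reading is available for that step.
    dt: time step in seconds.
    alpha: weight given to the gyro-integrated estimate (assumed value).
    Returns the list of yaw estimates, one per step.
    """
    yaw = 0.0
    history = []
    for rate, absolute in zip(gyro_rates, absolute_yaws):
        yaw += rate * dt  # relative update: any bias accumulates here
        if absolute is not None:
            # Pull the estimate toward the absolute reference.
            yaw = alpha * yaw + (1.0 - alpha) * absolute
        history.append(yaw)
    return history


steps = 10_000                      # 100 seconds at 100 samples per second
biased_gyro = [0.5] * steps         # head is still, gyro reports 0.5 deg/s
drifting = track_yaw(biased_gyro, [None] * steps, dt=0.01)
corrected = track_yaw(biased_gyro, [0.0] * steps, dt=0.01)
print(round(drifting[-1], 1))   # error keeps growing without a reference
print(round(corrected[-1], 3))  # error stays bounded with one
```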
The STEM System is another successful Kickstarter project that provides 3D hand controllers and clip-on sensors (Fig. 4). It is ideal for use with a virtual reality system like Oculus Rift. Typically, multiple STEM sensors would be used, such as hand controllers plus clip-on sensors, to track body movement more accurately. The Razer Hydra is a wired version of the technology, but STEM utilizes Bluetooth for wireless connectivity.
The ultimate is a combination of Oculus Rift, Sixense’s STEM System, and Virtuix’s Omni (Fig. 5). The Omni is a platform as well as an interface device. The combination is probably the closest thing to a holodeck that can be achieved today.
Users stand in the middle of the Virtuix Omni platform. It has a low-friction, grooved surface with a low-angle, bowl-like architecture. Users wear a special set of pinned shoes that slide easily along the grooves. This stabilizes the feet and prevents them from sliding sideways. Users slide back to the center even with continuous walking movement in any direction.
Users also wear a belt that is connected to the stabilizing ring. They can then walk, run, jump, and slide in place. Like most simulations, it is not perfect but it is very good.
The system translates general movements into actions that a game can take advantage of so the display presented on the virtual reality headset will replicate these actions in the virtual world. The addition of the STEM System provides additional feedback. Virtuix Omni will work with other controllers as well.
Gaming is where the most activity around devices like Virtuix Omni, Oculus Rift, and STEM Systems is occurring, but it is not the only place where virtual reality will make a difference. In fact, non-gaming applications will likely be more important as the technology becomes more available. Sixense’s MakeVR software, which utilizes STEM, is an easy to use 3D CAD system that can generate designs for 3D printers.
3D Video Image Recognition
3D video playback has not been a resounding success in the HDTV market, but 3D image recognition has. Microsoft’s original Kinect, based on PrimeSense 3D imaging technology, has been a huge hit for Microsoft’s Xbox (see “How Microsoft’s PrimeSense-Based Kinect Really Works” at electronicdesign.com). It has also been a boon for robotics developers, providing a low-cost 3D sensing system.
The PrimeSense approach emits an infrared pattern that is then read by an image sensor and analyzed by a system-on-chip (SoC). The deformation provides 3D depth information. The system has a matching color camera, so these images can be combined with the depth information as well.
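The link between pattern deformation and depth can be illustrated with the idealized triangulation relation used for any emitter-and-camera pair: depth equals focal length times baseline divided by the observed pattern shift. This is a simplified sketch, not PrimeSense’s actual algorithm, and the focal length, baseline, and shift values are hypothetical.

```python
def depth_from_shift(shift_px, focal_length_px, baseline_m):
    """Estimate depth in meters from the observed shift of a pattern dot.

    shift_px: how far the dot appears displaced in the image, in pixels.
    focal_length_px: camera focal length expressed in pixels.
    baseline_m: distance between the emitter and the image sensor.
    Returns None when the shift is zero or negative, since the
    idealized relation gives no finite depth in that case.
    """
    if shift_px <= 0:
        return None
    return focal_length_px * baseline_m / shift_px


# Hypothetical numbers: 580-pixel focal length, 7.5-cm baseline.
print(depth_from_shift(29.0, 580.0, 0.075))  # larger shifts mean closer objects
```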
The second incarnation of the Kinect utilizes a different technology, time-of-flight, like that developed by SoftKinetic (see “Time-Of-Flight 3D Coming To A Device Near You” at electronicdesign.com). In this case, a simple infrared emitter is used and a special image sensor can detect the timing associated with the light pulses.
SoftKinetic provides development platforms that work in near-field configurations like that found in front of a laptop or far-field that would be needed for a stand-up gaming system like the Kinect. The primary difference between near-field and far-field operation is the intensity of the infrared diode. Far-field operation requires more power, which would blind the sensor in near-field operation.
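The arithmetic behind time-of-flight is short: light travels to the object and back, so distance is the speed of light times the round-trip time, divided by two. The sketch below shows that relation and the timing precision it implies. The function names are made up for this example, and real sensors often measure phase shift rather than timing a single pulse directly.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance_m(round_trip_s):
    """Distance to an object given the light's round-trip time in seconds."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def timing_resolution_s(depth_resolution_m):
    """Round-trip timing precision needed to resolve a given depth step."""
    return 2.0 * depth_resolution_m / SPEED_OF_LIGHT_M_PER_S


print(tof_distance_m(10e-9))      # a 10-ns round trip is roughly 1.5 m
print(timing_resolution_s(0.01))  # 1 cm of depth needs timing on the order of tens of picoseconds
```

The second result shows why a special image sensor is needed. Resolving centimeters means resolving timing differences far below a nanosecond.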
The Creative Senz3D looks like a typical HD clip-on USB camera, but it incorporates the near-field version of SoftKinetic’s engine (Fig. 6). Like HD cameras, the 3D systems could be built into mobile devices like laptops and tablets.
Microsoft provided a software development kit (SDK) for its Kinect platform after hackers turned the initially closed device into a practical tool. Now gesture recognition can be achieved using the Kinect for applications like robotics. The SDK does the heavy lifting, including support for skeletal tracking and 3D gesture recognition.
Intel’s Perceptual Computing SDK is another framework for working with 3D imaging and more because it also addresses other sensor inputs including audio. Creative’s Senz3D is the 3D imaging hardware reference platform for the SDK.
Yet another 3D imaging technology is available from Leap Motion. Like the aforementioned platforms, it is available as a USB-based device and supported by gesture recognition software. Leap Motion’s approach also uses a set of infrared emitters and a sensor packaged in a small dongle that sits in front of a laptop so it can see a user’s fingers and hands when gestures are performed in front of the device.
Leap Motion’s technology is built into HP’s Envy notebook (Fig. 7). It is integrated with Microsoft Windows, enabling the user to control the interface without touching the screen. The advantages of integration are significant since placement is fixed with respect to the screen and the sensor is hidden within the case.
3D is not a requirement for a useful image recognition tool. Sufficient resolution and processing power are all that are needed. The processing requirements can be significant, so an Arduino platform might be impractical, but heftier compute platforms like a Tegra 3 or 4 do have the horsepower to perform this type of analysis.
For example, the Vital Sign Camera application from Philips can detect heart and breathing rates using the video stream from a conventional camera on most mobile devices (see “Webcams For Gesture Recognition And More Vision Tricks” at electronicdesign.com).
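A rough idea of how a periodic rate could be pulled from video: average the brightness of a skin region in each frame, then look for the strongest frequency in the plausible heart-rate range. The sketch below does that with a direct Fourier sum over a synthetic signal. It is an assumption-laden illustration and not Philips’ method. It also assumes the per-frame brightness values have already been extracted.

```python
import math


def estimate_rate_bpm(samples, fps, min_bpm=40, max_bpm=180):
    """Return the dominant periodic rate in beats per minute.

    samples: one brightness value per video frame (assumed to be the
        mean of a skin region, extracted elsewhere).
    fps: frames per second of the video stream.
    min_bpm, max_bpm: search range (assumed values).
    Returns None if there are no samples or no periodic component.
    """
    n = len(samples)
    if n == 0:
        return None
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    best_bpm, best_power = None, 0.0
    for bpm in range(min_bpm, max_bpm + 1):
        freq = bpm / 60.0  # candidate frequency in Hz
        re = sum(c * math.cos(2 * math.pi * freq * i / fps)
                 for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * freq * i / fps)
                 for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic 10-second clip at 30 frames/s with a faint 72-beat/min ripple.
fps = 30
signal = [100 + 0.5 * math.sin(2 * math.pi * 1.2 * i / fps) for i in range(300)]
print(estimate_rate_bpm(signal, fps))  # 72
```

Real video adds motion, lighting changes, and sensor noise, which is where the processing power mentioned above comes in.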
PointGrab provides Microsoft Windows-based 3D gesture recognition using the typical built-in camera found on notebooks and tablets. Pinch and zoom hand gestures can be used to interact with applications without a touchscreen interface (watch “PointGrab Gesture Control Software Integrates With 2D Device Cameras” at engineeringtv.com). Its 3D precision is not as high as that of the 3D devices already mentioned, but high precision is often unnecessary for analyzing gestures and relative motion where visual feedback is sufficient.
Systems like the Kinect that use infrared imaging do not work well in many environments, such as in daylight where sunlight can overpower and blind the sensors. PointGrab’s system will be limited by the camera as well, but more light is usually better.
Dual-camera 3D imaging systems are also available, but they have yet to scale to consumer level products. They also have high computational requirements. Camera-only solutions can suffer from aliasing issues in the analysis software as well.
Voice recognition and control have been around for decades and have improved significantly. They only require a microphone and a speaker for feedback, so they are even lower in cost than imaging systems. Wading through automated voice call centers is no fun, but you might find the latest interactive voice response (IVR) systems to be rather fluent and understanding.
IVR has benefited from a combination of steady voice recognition improvements and the ability to apply more processing power to the problem. Improved audio processing also removes background noise and improves the starting point for voice recognition software.
Beyond IVR systems, voice recognition has become more common. It can be found on most automotive navigation systems, and Apple’s Siri brought the world’s attention to voice recognition on smart phones.
The challenge with voice recognition compared to image processing is that the expectations for voice recognition are much higher. Most people expect a system to understand the meaning of a statement they issue and have the computer act accordingly, whereas the current state of affairs with image recognition is more basic with pinch and zoom gestures activating a limited set of actions.
Fingerprint recognition is used for identification purposes, and its cost and reliability have improved greatly. Various forms of the technology have been available for years, and it has become more common.
Apple’s iPhone 5s is notable because its single button doubles as a fingerprint sensor. The first touch likely will identify the user. An entire article could be written about the issues surrounding the iPhone 5s’s sensor and the security or insecurity associated with it.
Fingerprint sensors are standard fare on other devices like laptops and desktop keyboards. They can even be found on secure external hard drives like those from Apricorn (see “Checking Out Biometric Security” at electronicdesign.com).
Biometric identification is not necessarily restricted to fingerprints. Face recognition using cameras is already available. Applications like Visidon AppLock use the forward-looking camera on smart phones. In the future, biometric sensor fusion with other methodologies such as voice recognition may provide faster, more secure recognition.
Developers creating interfaces for consumer electronic products now have a wide variety of options that can provide low-cost, high-functionality feedback. Hopefully they will be as understandable and easy to use as keyboards and buttons.