Today's unique human-machine interface (HMI) experience starts when the door unlocks for the driver and he or she pushes a button to start the vehicle. However, the areas that are getting the greatest focus from carmakers and a number of suppliers are those HMIs that impact the driver while the vehicle is moving. The well-known quote, “Hands on the wheel, eyes on the road, and mind on the drive” is being taken very seriously.
New HMIs are both technology driven and driver driven, according to Scott Geisler, lead engineer, noise and vibration performance at General Motors Corp. “People have always seemed to want to try and do more things in a car than just drive,” he said. “As complicated and variable a task as it can be.”
People interface differently with their vehicles today. The requirements are for more effective and safer use of technology. “As we add more features to the vehicle, it becomes more of a challenge to be able to communicate with the driver or allow the driver to communicate with these devices and not impose an incremental workload on the driver,” said John Barclay, director of advance cross system development, Visteon.
There appears to be broad agreement that spoken commands are the best way to go for HMI technology. However, opinions vary on the readiness of these systems. Cost is one of the deterrents. Different languages, accents, dialects and even usage of language add to the complexity and challenges of speech input. In its approach, shown in Figure 1, Siemens is looking at a combination of techniques to involve as many senses as possible: expanded visuals with a heads-up display, some speech recognition, and touch with handwriting recognition. Nhu Thien Nguyen, the lead innovator of Siemens' EasyCo technology, said the unit uses 5 MIPS to 10 MIPS of 32-bit computing performance along with handwriting recognition and text-to-speech software. It takes from 200 KB to 1 MB of memory and requires an LCD or TFT display.
Siemens' haptic turn-push knob gets harder or easier to move, providing feedback to the user. The feel of the knob becomes an input device. The single unit eliminates buttons and allows one central control to provide extensive menu options and intuitive navigation and selection. The Siemens unit in BMW's iDrive has four different quadrants. The knob clicks for some higher-level choices, but in some instances, such as volume control or temperature setting, the knob provides continuous change without clicking. In Siemens' CESAR modular cockpit concept, sensors in the seat identify the driver or passenger, so one button can be used to change the air temperature on the driver's side or the passenger's side, or to activate the heated seat in that zone. The driver's or passenger's body becomes part of the HMI.
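The difference between the knob's clicking and continuous behavior can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Siemens' design: the names KnobMode, MODES and knob_output are hypothetical, the resistance values are invented, and the four detents simply mirror the four quadrants mentioned above.

```python
from dataclasses import dataclass


@dataclass
class KnobMode:
    name: str
    detents: int       # number of click positions per turn; 0 means continuous
    resistance: float  # assumed relative torque, 0.0 (free) to 1.0 (stiff)


# Hypothetical modes: a four-way menu that clicks, and a smooth volume control.
MODES = {
    "menu": KnobMode("menu", detents=4, resistance=0.6),
    "volume": KnobMode("volume", detents=0, resistance=0.2),
}


def knob_output(mode: KnobMode, angle_deg: float):
    """Map a knob angle to a menu index (detented) or a 0..1 level (continuous)."""
    angle = angle_deg % 360.0
    if mode.detents:
        step = 360.0 / mode.detents
        return int(angle // step)   # which click position the knob rests in
    return angle / 360.0            # smooth value, no clicks


print(knob_output(MODES["menu"], 100))    # 1
print(knob_output(MODES["volume"], 90))   # 0.25
```

In a physical unit the resistance field would drive the actuator torque; here it is carried only as data to show that feel is a per-mode setting.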
The optimized use of displays is important because suppliers need to deliver technologies that are intuitive and easy to operate, and that includes the display, display mechanisms, and the actual controls. This could be a touch screen on a display, a rotary knob or a push-pull knob. “It's really up to us to make sure that all of these controls are intuitive, easy to operate and easy to use and don't put an undue burden on the driver,” said Visteon's Barclay.
Some touch screen displays merely require proximity and not direct touch of the screen to bring up the right level of information. Besides avoiding smudging the screen, these newer types of displays can reduce driver workload. “At this point it is not clear which is more convenient for the driver,” said Barclay.
The jury is certainly out on the exact way for automakers and suppliers to proceed. “Even in a small segment like portable music players there is no agreement on HMI,” said Jack Morgan, senior director, automotive marketing and sales for North America, Philips Semiconductor. HMI development in home and portable products is definitely going to impact future vehicle systems, he insisted, with the initial impact in the entertainment system.
VOICE COMMANDS AND THE TALKING CAR
“The thing that is really starting to emerge, we see it deploying today and we really think it represents potential, is the speech/hearing interface,” said GM's Geisler. This is because driving is essentially a visual/manual task. Carmakers need to minimize the loading that might compete with the visual/manual driving. “To meet the goals of ‘hands on the wheel, eyes on the road, mind on the drive’ you want to keep the task as simple as possible, minimize the investment into getting things done,” he added.
At another level, carmakers want to have things as common as possible. “For example, in the HMI world for secondary systems, such as entertainment and navigation, we are somewhat where primary controls were at the beginning of the last century with tillers, sticks, and steering wheels as options, but now we have common interfaces,” said Geisler. The secondary controls are not as common, especially at a very deep level. Geisler noted that applied research suggests there is an opportunity to allow people to do the things they need and want to do by using a separate channel — listening and speaking back. Of course, this happens only if the technology can support this approach.
For voice recognition there are two types of input: speaker independent and speaker dependent. The OnStar system available on more than 50 GM models for 2005 deploys a mixture of both. The initial command structure tends to be speaker independent, so just about anyone can initially engage the activity without any training for dialect, accent, etc. Command words are speaker independent. Speaker-dependent input uses prestored voice tags. For example, “call” is speaker independent but “home” is a speaker-dependent voice tag that must be programmed in a speaker-dependent mode in the software. When the white button on OnStar's three-button interface is pressed, the system prompts the user with a “Connecting to OnStar” voice message and accepts speech input with the personal calling package.
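The split between speaker-independent command words and speaker-dependent voice tags can be pictured as a two-stage lookup. The Python below is a minimal sketch, not OnStar's implementation: it works on already-recognized text rather than audio, and the names COMMANDS, VoiceTagStore and parse_utterance, the command list and the phone number are all hypothetical.

```python
from dataclasses import dataclass, field

# Fixed command vocabulary: usable by any speaker without training (assumed list).
COMMANDS = {"call", "dial", "store", "delete"}


@dataclass
class VoiceTagStore:
    """Per-user tags that must be enrolled before they can be used."""
    tags: dict = field(default_factory=dict)

    def enroll(self, tag: str, number: str) -> None:
        # A real system would record the speaker's own voice at this step;
        # this sketch stores the tag as plain text.
        self.tags[tag.lower()] = number

    def lookup(self, tag: str):
        return self.tags.get(tag.lower())


def parse_utterance(text: str, store: VoiceTagStore):
    """Return (command, resolved_argument), or None if the command is unknown."""
    words = text.lower().split()
    if not words or words[0] not in COMMANDS:
        return None
    argument = " ".join(words[1:])
    return words[0], store.lookup(argument)


store = VoiceTagStore()
store.enroll("home", "555-0100")
print(parse_utterance("Call home", store))    # ('call', '555-0100')
print(parse_utterance("Call office", store))  # ('call', None)
```

The second call shows why tags need enrollment: the command word is understood, but “office” was never programmed, so nothing resolves.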
Visteon is the first supplier of integrated, factory-installed voice-activated Bluetooth wireless systems in automotive production in Europe and North America. In Visteon Voice and MACH Voice Link, the voice recognition engine and critical components such as the HMI framework are all Visteon proprietary, according to Mike Bryars, senior manager, Visteon Electronics product line team. These proprietary components have enabled Visteon to deliver unique features and performance.
The size of the memory in the Visteon system depends upon the particular features implemented for a given product but the base system configuration contains 8 MB of storage for the data and applications. Figure 2 shows Visteon's MACH Voice Link module that is fully integrated into the vehicle's electronic system. Today, the system does not employ natural voice recognition. “Natural language voice systems require significantly more memory,” said Bryars. “But with the compression technologies and MIPS bandwidth of processors available today a 2x increase in memory size is probably achievable.” The real challenge is getting these more complex technologies to work reliably and intuitively in the car under real world driving conditions.
Other suppliers entering the HMI technology race with speech-recognition technology include Microsoft and Motorola. Figure 3 shows a system from each supplier. Microsoft's TBox is a reference design defined by Microsoft and Fiat Auto for Bluetooth voice and USB connectivity. Motorola's IHF1000 is being offered for aftermarket sales.
The TBox has a 300 MHz to 400 MHz processor and limited onboard memory. “The devices we are building have 32 megabytes of RAM and 32 megabytes of Flash, but the cool thing is the storage is actually being driven by the consumer,” said Peter Wengert, group marketing manager for Microsoft's automotive business unit. “We are basically indexing these files and creating the grammar for the voice technology to work, so you can play by album, artist or play list.”
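Wengert's description of indexing files to create the voice grammar can be illustrated with a short Python sketch. It is not Microsoft's implementation: the track metadata is placeholder data, the phrase format “play artist …” is an assumption, and build_grammar and resolve are hypothetical names.

```python
from collections import defaultdict

# Placeholder metadata, as it might be read from files on a USB device.
TRACKS = [
    {"title": "Track 1", "artist": "Artist One", "album": "Album X"},
    {"title": "Track 2", "artist": "Artist One", "album": "Album X"},
    {"title": "Track 3", "artist": "Artist Two", "album": "Album Y"},
]


def build_grammar(tracks):
    """Map each speakable phrase to the list of track titles it selects."""
    grammar = defaultdict(list)
    for track in tracks:
        for key in ("artist", "album"):
            phrase = f"play {key} {track[key]}".lower()
            grammar[phrase].append(track["title"])
    return dict(grammar)


def resolve(utterance, grammar):
    """Return the matching tracks, or an empty list for out-of-grammar input."""
    return grammar.get(utterance.lower().strip(), [])


grammar = build_grammar(TRACKS)
print(resolve("Play artist Artist One", grammar))  # ['Track 1', 'Track 2']
print(resolve("Play album Album Y", grammar))      # ['Track 3']
```

Because the grammar is rebuilt from whatever the consumer plugs in, the set of valid phrases grows with the music collection while the command words themselves stay fixed.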
There are two different TBox versions that will be introduced by Fiat. The basic unit has Bluetooth capability as well as a USB port that will ship with every car. Bluetooth is the primary input for the hands-free phone connection. The USB port is essentially for digital music integration, a mass storage device such as an MP3 player, a WMA player or a USB memory card.
The secondary input is a small LCD screen in the instrumentation area that is already used for diagnostics. The unit displays info such as who is being called. The driver uses steering wheel buttons to cycle through the different commands and options of the secondary input. The number of buttons will vary but it will range from four to eight.
The second, more advanced version will have a GPS chip for location-based information and a GSM module, so the phone can be integrated into the design. The computing requirements do not change between the basic version and the version with GPS and GSM. More software is added to handle the two-way connection, but it still fits within the 32 MB of RAM and 32 MB of Flash. Off the shelf, the box goes for about $100 to a Tier 1 or OEM, and with a large manufacturer involved, Wengert expects this cost to come down.
Microsoft has performed extensive research and consumer awareness testing around voice technology in the car. “What we have found with our research is that you have to keep it simple and it's better for your accuracy to come up with a limited set of commands,” said Wengert. Some of Microsoft's systems only have about 60 commands. One technology they are not currently addressing is dictation. “We are not going to be composing e-mails or doing street address inputs — that's probably a technology that is about five or 10 years out,” noted Wengert. Figure 4 shows the TBox architecture.
Motorola's IHF1000 underwent a number of tests where consumers evaluated it in a driving simulator observing and analyzing the interface and providing feedback to designers. It is not just a simple command and control system. A dialog management technology developed by Motorola allows the user to bring up various dialogs that give them greater freedom than a general command and control system. “In products of the future, you should be able to expect to see much more dynamic and even freer, more natural-like dialog,” said Mike Gardner, director of intelligence systems research, Motorola Labs.
CENTRALIZED VS. DISTRIBUTED HMI
Visteon's TACNET shown in Figure 5 is more of a centralized HMI vs. a distributed HMI. The system is offered for law enforcement agencies and emergency response vehicles. With this system, Visteon is learning how to appropriately display information, the best way to communicate that information back, and how to optimize the interface in that kind of intense environment. “As we look at some HMI or control devices in vehicles today there seems to be more of a tendency toward centralization vs. decentralization,” Barclay said. Locating controls in a common place so that passengers and drivers don't have to search for the buttons is important. At the same time, Barclay sees the need for more distributed control around the passenger compartment to personalize an audio system, an infotainment system, or even a climate control system.
“In the near term at least, in the next five to seven years, we are going to see a combination of centralized control and decentralized controls in the vehicle environment,” said Barclay. “If we extend the wireless technology developed for headphones it will allow us to take the HMI for occupants or driver and put it in any remote location, any location convenient for the driver.” Visteon is looking into other wireless technologies that could impact driver interfaces, including Bluetooth, IEEE 802.11 and wireless USB.
READ MY LIPS
To address the noise issues in vehicles and improve system robustness, algorithms will certainly be improved. However, the combination of visual information with speech recognition provides carmakers another avenue. “A camera focused on the speaker can actually read their lips,” said Motorola's Gardner. The lip tracking system combined with speech will provide even further robustness in a longer time frame.
Pointing a camera at the driver has other potential benefits. “We can look for drowsy driver indications with the percent of eye closure,” said Gardner. “We can also look at the eye scanning patterns.” The camera would be able to help the drowsy driver situation by recognizing symptoms and alerting the driver. Such a system could also address inebriation, distraction or medical alert situations.
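The percent-of-eye-closure measure Gardner mentions can be computed as a running fraction over recent camera frames. The Python below is a sketch under stated assumptions, not Motorola's method: it presumes an upstream vision stage already labels each frame as eyes open or closed, and the class name PerclosMonitor, the window length and the alert threshold are all illustrative choices.

```python
from collections import deque


class PerclosMonitor:
    """Tracks the fraction of recent video frames in which the eyes are closed."""

    def __init__(self, window_frames=900, alert_threshold=0.15):
        # Both defaults are assumptions for illustration;
        # 900 frames is 30 seconds at 30 frames per second.
        self.frames = deque(maxlen=window_frames)
        self.alert_threshold = alert_threshold

    def update(self, eyes_closed: bool) -> float:
        """Add one frame's eye state and return the current closure fraction."""
        self.frames.append(1 if eyes_closed else 0)
        return sum(self.frames) / len(self.frames)

    def is_drowsy(self) -> bool:
        # Judge only once the window is full, so start-up blinks do not alert.
        if len(self.frames) < self.frames.maxlen:
            return False
        return sum(self.frames) / len(self.frames) >= self.alert_threshold


monitor = PerclosMonitor(window_frames=10, alert_threshold=0.3)
for closed in [False, False, True, False, True, True, False, False, True, False]:
    level = monitor.update(closed)
print(level, monitor.is_drowsy())  # 0.4 True
```

A production system would also need to separate normal blinks from slow closures and to cope with glasses and lighting, which this sketch ignores.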
Both suppliers and carmakers have a number of opinions regarding where HMIs are headed. Over the next five to 10 years, Visteon's Barclay does not anticipate any surprises for interface technology. “I think you will see a slower evolution exploiting voice technology. For the most part, displays will continue to be developed, displays that we no longer have to touch that interact with voice as well and some combination — some synergy,” said Barclay. “I think in the future we will probably see the capability to customize these devices to some degree, especially displays to meet the driver or the occupant needs so we would like them to be more and more flexible.”
According to Motorola's Gardner, “The next big step is to make continuous speech interface systems better, where blurring of the number without extensive pauses for individual digits is allowed.” This will allow the system to handle more natural dialog, where the system picks out the important information. Gardner indicated that this kind of dialog could be available for automakers to implement within the next two to five years.
“There will be a lot of interaction and most of that interaction is going to involve a high amount of computing power,” said Philips' Morgan. “I don't see compute power as a limitation to bringing in these HMIs, the limitations are more in the algorithms and the touch and feel aspects of the HMIs.”
GM's Geisler summed up the current situation: “We are at the beginning of a very interesting explosion of technology (requiring) input from customers and our other stakeholders as to what is desirable, permissible and effective. And it's a very dynamic and uncertain environment.”
ABOUT THE AUTHOR
Randy Frank is president of Randy Frank & Associates Ltd., a technical marketing consulting firm based in Scottsdale, Ariz. He is an SAE and IEEE Fellow and has been involved in automotive electronics for more than 25 years.