AgeLab / New England University Transportation Center releases new research on voice interfaces in cars

Thu, 09/25/2014

Automotive voice-command interfaces may require more visual engagement from drivers than previously thought, according to a recent series of MIT AgeLab / New England University Transportation Center studies.

Vehicles are incorporating voice-command interfaces at higher rates than ever before; while only 37 percent of cars had voice recognition systems in 2012, this figure is predicted to reach 55 percent by 2019. This increasing presence has been driven in part by the assumption that voice interfaces are a hands-free, eyes-free way of communicating with vehicle systems that is less demanding than traditional manual controls, which may require drivers to take their eyes off the road for periods of time.

AgeLab researchers Bryan Reimer, Bruce Mehler, and colleagues set out to assess whether and to what extent voice-command interfaces impact driver behavior. As the results came in, however, a more complex picture emerged.

In the research, published initially as a technical report (see links below), participants drove an instrumented test vehicle on a multi-lane highway and performed a range of tasks, such as tuning the radio or entering a destination into the navigation system. Radio tuning tasks were completed using both voice-initiated and traditional manual controls; the more extensive address entry task was carried out using the voice interface alone. The AgeLab team developed a comprehensive picture of the demands placed upon drivers using a variety of measures. While driving, participants wore sensors that monitored their heart rate and skin conductance levels; these physiological indicators provide an objective measure of the workload associated with engaging in secondary activities while driving. Changes in driving behavior, such as adjustments in vehicle speed and steering wheel control, were logged. Perhaps most crucially, cameras located on the dashboard monitored participants’ eyes throughout each drive, recording their every glance. This footage, once quantified, provided critical information on exactly how much time drivers spent with their eyes directed toward the road versus visually engaging with equipment inside the car. Finally, participants rated how much workload they felt was involved in completing each task.
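The glance measure at the heart of these comparisons is straightforward to compute once the camera footage has been coded. The following minimal sketch (in Python, with hypothetical field names and example data; it is not the AgeLab pipeline) illustrates how total eyes-off-road time for a single task might be tallied from coded glance intervals.

def eyes_off_road_seconds(glances):
    """Sum the duration of all glances directed away from the forward roadway.

    `glances` is a list of (start_s, end_s, target) tuples, as might be produced
    by manual or automated coding of dashboard camera footage (hypothetical format).
    """
    return sum(end - start for start, end, target in glances if target != "road")

# Example: one radio-tuning task coded into three glances (illustrative data only).
task_glances = [
    (0.0, 4.0, "road"),       # eyes forward
    (4.0, 5.5, "interface"),  # glance down to the center-stack screen
    (5.5, 9.0, "road"),       # eyes forward again
]
print(eyes_off_road_seconds(task_glances))  # 1.5 (seconds off road for this task)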

Participants rated most voice tasks as intermediate in difficulty between easy and hard manual radio tuning (pressing a single radio preset button vs. selecting a station by pressing multiple radio control buttons and rotating the tuning knob). Similarly, objective measures (heart rate and skin conductance) suggested low to moderate workload. The amount of time that drivers took their eyes off the road was also shorter when using the voice interface to tune to a specific radio station. Thus, the voice interface offered clear advantages over the manual interface for tuning the radio. On the other hand, the voice-based navigation task was associated with drivers spending a relatively long time (an average of 32.8 seconds) looking away from the road. A second study (see links below) produced similar results in terms of extended eyes-off-road time for voice-based address entry.

All told, the data indicate that some voice-initiated activities require more visual attention from drivers than previously recognized; that is, voice systems are not necessarily hands-free, eyes-free ways of communicating with vehicle interfaces. In fact, they are more appropriately considered multimodal interfaces, drawing upon a varying combination of auditory, vocal, visual, manual, and cognitive resources.

But how can a voice-controlled system demand visual attention? Because often, "voice-controlled" doesn't mean "screen-free." In the study, certain voice tasks required drivers to choose an option from a list of items presented on a screen, which meant that visuals were still an important part of the voice task process. These visual prompts were likely included as a key aspect of system design to help reduce mental demand on drivers, such as the need to remember a list of options before making a selection.

In a subsequent study (see links below) that is now available in the proceedings of the International Conference on Automotive User Interfaces and Interactive Vehicular Applications (Auto-UI), the research team compared the default mode of a production voice system with an “Expert” mode, which streamlines tasks by removing several extended auditory prompts and vocal confirmation steps. While the Expert mode decreased the overall amount of time it took drivers to complete a task, they still glanced away from the road and toward the in-vehicle interface screen for about the same amount of time. The research also showed that many drivers had a tendency to speak at the interface screen, much as though it were a person. This pattern was most evident in Study 1, which compared drivers in their 20s and 60s; in that sample, older participants were the most likely to show this “social-like” orienting behavior.

While this research did not include drivers over age 69, other age-related patterns were observed. As might be expected, older drivers tended to report higher levels of workload when engaging with the voice interface, took longer to complete tasks, and looked away from the roadway for longer during tasks. However, many drivers in the older age categories performed just as well as the average younger driver. It is possible that factors other than age in years, such as health status, may influence where older drivers fall on these measures. Additional investigation of this question is ongoing.

“If there’s something on the screen moving, the driver will look at it,” said AgeLab research scientist Jonathan Dobres, regardless of whether it is needed to complete the task at hand. Humans have difficulty ignoring motion in the visual field, and even slight changes to the information on the screen may draw attention.

One answer to lower-demand vehicle interfaces may lie in voice control systems that strategically limit the presentation of visual information. By keeping visual cues to a minimum and showing words or images only when necessary to support task completion, one may hypothesize that demands on vision will be reduced. However, future research will need to assess whether such design strategies actually offset the driver’s tendency to look at the system even when looking is not required, and whether mental demands remain reasonably low with a targeted reduction in visually presented prompts. Overall, the lab’s research has shown that it is not yet well understood how modern vehicle interfaces draw upon drivers’ attention, or what design strategies provide an optimal balance between voice and visual-manual interactions.

Technology in the interface world is constantly evolving, Dobres said, “with companies like Apple and Android working with automotive manufacturers to update current in-vehicle devices at a rapid pace. Further investigation is needed to assess the impact of software such as Apple’s Siri application, which relies solely on voice commands without the presence of a visual interface. Continuing this kind of research is crucial in terms of its potential impact on automotive safety policy.” “It is often assumed that voice interface systems will keep your eyes on the road,” Dobres said, “but that’s not necessarily true.”

In summary, across a series of three studies encompassing 156 individuals, results showed that performing tasks with the voice-command interface induced low to moderate levels of workload, but some tasks drew upon greater than expected levels of visual attention. These findings clearly illustrate that modern vehicle interfaces can be highly multimodal in nature, drawing upon varying degrees of auditory, vocal, visual, manual, and cognitive resources. Consequently, all such potential resource demands should be considered in evaluating drivers’ interactions with in-vehicle and portable interfaces.

(Update of article initially posted 9/12/2014)

Study 1

Study 1 was supported by Toyota’s Collaborative Safety Research Center (CSRC), The Santos Family Foundation, and the Region 1 New England University Transportation Center at the Massachusetts Institute of Technology.

The white paper providing an overview of key findings of the study can be found here.

A comprehensive technical report on the study can be found here and appendices here.

Citations to the research are:

Reimer, B. & Mehler, B. (2013). The Effects of a Production Level “Voice-Command” Interface on Driver Behavior: Summary Findings on Reported Workload, Physiology, Visual Attention, and Driving Performance. MIT AgeLab White Paper No. 2013-18A. Massachusetts Institute of Technology, Cambridge, MA.

Reimer, B., Mehler, B., Dobres, J. & Coughlin, J.F. (2013). The Effects of a Production Level “Voice-Command” Interface on Driver Behavior: Reported Workload, Physiology, Visual Attention, and Driving Performance. MIT AgeLab Technical Report No. 2013-17A. Massachusetts Institute of Technology, Cambridge, MA. (Note: due to size considerations, .pdf versions of the report may appear as two files, a main report and an appendix.)

Videos illustrating the voice command tasks utilized in Study 1 are available on YouTube at the following links:

Study 1 - Manual Radio Tuning
Study 1 - Voice Radio Tuning
Study 1 - Voice Navigation Entry
Study 1 - Voice Song Selection
Study 1 - Voice Contact Dialing
Study 1 - 3 level n-back calibration tasks

Study 2

Study 2 was supported by Toyota’s Collaborative Safety Research Center (CSRC) and the Region 1 New England University Transportation Center at the Massachusetts Institute of Technology.

The white paper providing an overview of key findings of the study can be found here.

A comprehensive technical report on the study can be found here.

Citations to the research are:

Mehler, B. & Reimer, B. (2014). An Executive Summary: Further Evaluation of the Effects of a Production Level “Voice-Command” Interface on Driver Behavior: Replication and a Consideration of the Significance of Training Methodology. MIT AgeLab White Paper No. 2014-7. Massachusetts Institute of Technology, Cambridge, MA.

Mehler, B., Reimer, B., Dobres, J., McAnulty, H., Mehler, A., Munger, D. & Coughlin, J.F. (2014). Further Evaluation of the Effects of a Production Level “Voice-Command” Interface on Driver Behavior: Replication and a Consideration of the Significance of Training Methodology. MIT AgeLab Technical Report No. 2014-2. Massachusetts Institute of Technology, Cambridge, MA.

Videos illustrating the voice command tasks utilized in Study 2 are available on YouTube at the following links:

Study 2 - Manual Radio Tuning
Study 2 - Voice Radio Tuning
Study 2 - Voice Navigation Entry
Study 2 - 4 level n-back calibration tasks and prompted reference

Study 3

Study 3 was supported by The Santos Family Foundation and the Region 1 New England University Transportation Center at the Massachusetts Institute of Technology.

A paper to appear at the 6th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (September 17 - 19, 2014) can be found here.

Citation to the research is:

Reimer, B., Mehler, B., Dobres, J., McAnulty, H., Mehler, A., Munger, D., & Rumpold, A. (2014). Effects of an ‘Expert Mode’ Voice Command System on Task Performance, Glance Behavior & Driver Physiology. Proceedings of the 6th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutoUI 2014), Seattle, WA.

Videos illustrating the voice command tasks utilized in Study 3 are available on YouTube at the following links:

Study 3 - Manual Radio Tuning
Study 3 - "Expert Mode" Voice Radio Tuning
Study 3 - "Expert Mode" Voice Navigation Entry
Study 3 - 4 level n-back calibration tasks and prompted reference

MIT AgeLab
1 Main Street, 9th Floor
Cambridge, MA 02142
ph: 617.253.0753
email: agelabinfo(at)mit.edu
