MIT AgeLab / New England University Transportation Center Research on Drivers’ behavior with automotive production level “voice command” interfaces

Tue, 11/19/2013

Automotive control systems that are designed to keep drivers’ hands on the wheel may sometimes actually lead people to take their eyes off the road. That’s one of many conclusions from a new study on vehicle voice command systems sponsored by Toyota’s Collaborative Safety Research Center (CSRC), The Santos Family Foundation and the Region 1 New England University Transportation Center at the Massachusetts Institute of Technology.

The study found that drivers take their eyes off the road to use voice command systems more often than expected. When using the voice interface for radio tuning, drivers kept their eyes on the road for a greater percentage of the time than when completing the identical task using traditional manual interaction, in line with general expectations for the potential advantage of this type of interface. However, in activities such as voice entry of a destination address into the in-vehicle navigation system and selection of a song from a USB-connected storage device, the visual demands of the “voice” interaction drew drivers’ eyes away from the road for periods beyond current industry and NHTSA guidelines for visual-manual interactions. (It is important to note that current NHTSA visual distraction guidelines explicitly state that they do not apply to “voice” interfaces.) Some of these behaviors were more pronounced among older drivers, some of whom were found to physically orient their bodies towards the voice command system’s graphical interface when engaging it.

While significant research has been conducted on handheld and experimental voice interaction systems, the study represents one of the most comprehensive efforts to examine the demands placed on drivers’ attention by systems currently available in the market, while also developing data that could support NHTSA in the development of its Phase III distraction guidelines.

The study was initially launched to investigate how or whether the mental demands imposed by the use of voice command systems impacted a driver’s focus on the road ahead. Researchers actually found that cognitive load as measured by physiological arousal and self-report was lower than expected, likely in part because drivers compensated for their use of voice command systems by slowing down, changing lanes less frequently or increasing the distance to vehicles ahead.

But the study also showed that the problem of driver distraction is more complicated than previously known. Modern vehicle control systems are highly multi-modal, placing demands on the driver’s attention in multiple ways, including visual, manual and auditory senses, among others. Much of the glance behavior observed during voice tasks was associated with looking at a console display screen to view options presented by the system, such as available commands, or to select from lists if the system identified multiple options for street names during address entry. Such support displays are often used to reduce the amount of cognitive load that would be placed on the driver by having to remember specific command phrases or needing to listen to an extended list of destination options. There clearly is a human factors design challenge to find a balance across these demands to optimally support drivers of differing capabilities and preferences.

Taken together, the results of the study contribute extensive data that can be used by the auto industry, other academic researchers, and the government in further refinement of guidance in the development of future voice systems. Since technologies developed to reduce one type of distraction may actually increase others, future designs need to take a more comprehensive approach, working from a goal of increasing the driver’s ability to focus on the road. This research highlights the question of how an acceptable level of demand should be defined in the context of multi-step and extended task time interactions that characterize activities involving voice-command interfaces.

Overall, the study illustrates the necessity for additional research assessing the generalizability of these findings to other production level and hand-held “voice” interactions, and for developing methods of quantitatively assessing the net attentional costs and benefits of providing drivers with information across different modalities. The study’s lead authors, Bryan Reimer and Bruce Mehler, emphasize that the results should not be interpreted as establishing any safety risks associated with the glance behavior observed in the research. What they do demonstrate is that, depending on the task and the design of the interface, it cannot automatically be assumed that simply including voice in a vehicle interface will keep a driver’s eyes on the road and result in a net safety advantage. Future naturalistic and/or epidemiological research will be necessary to gauge the degree to which interaction with systems such as those studied here presents any elevation in actual driving related risk.

Voice interactions can play an important role in the vehicle environment. Optimizing the selection of activities in which the driver utilizes voice interaction and the appropriate design of displays will help to maximize driver attentional focus towards information necessary for vehicle operation, while allowing, where appropriate, interactions with interfaces for comfort, convenience and communication functions. Ongoing AgeLab research continues to focus on these and other questions related to driver behavior with advanced vehicle technologies.

The white paper providing an overview of key findings of the study can be found here.

A comprehensive technical report on the study can be found here and appendices here.

Videos illustrating the voice command tasks utilized in the study are available on YouTube at the following links:

Manual Radio Tuning
Voice Radio Tuning
Voice Navigation Entry
Voice Song Selection
Voice Contact Dialing
N-Back calibration tasks

Citations to the research should appear as follows:

Reimer, B. & Mehler, B. (2013). The Effects of a Production Level “Voice-Command” Interface on Driver Behavior: Summary Findings on Reported Workload, Physiology, Visual Attention, and Driving Performance. MIT AgeLab White Paper No. 2013-18A. Massachusetts Institute of Technology, Cambridge, MA.

Reimer, B., Mehler, B., Dobres, J. & Coughlin, J.F. (2013). The Effects of a Production Level “Voice-Command” Interface on Driver Behavior: Reported Workload, Physiology, Visual Attention, and Driving Performance. MIT AgeLab Technical Report No. 2013-17A. Massachusetts Institute of Technology, Cambridge, MA. (Note: due to size considerations, .pdf versions of the report may appear as two files, a main report and an appendix.)

An earlier report on portions of the study appeared in the Proceedings of the 7th International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, Bolton Landing, NY. The full citation to that report follows:

Reimer, B., Mehler, B., McAnulty, H., Munger, D., Mehler, A., Garcia Perez, E.A., Manhardt, T. & Coughlin, J.F. (2013). A Preliminary Assessment of Perceived and Objectively Scaled Workload of a Voice-Based Driver Interface. Proceedings of the 7th International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design. Bolton Landing, NY. 537-543.

MIT AgeLab
1 Main Street, 9th Floor
Cambridge, MA 02142
ph: 617.253.0753
email: agelabinfo(at)mit.edu
