Sharon Oviatt

Department of Computer Science and Engineering
Oregon Graduate Institute of Science and Technology


P.O. Box 91000
Portland, Oregon 97291
office: (503) 690-1342
FAX: (503) 690-1548


www: http://www.cse.ogi.edu/~oviatt


Primary-- (5) Usability and User-Centered Design. Secondary-- (2) Speech and Natural Language Understanding; (3) Other Communication Modalities.


multimodal interfaces; spoken and pen-based interaction; dynamic interactive visual displays; error patterns and resolution; predictive modeling


Multimodal systems offer the potential for considerable flexibility, broad utility, and use by a larger and more diverse population than ever before. However, critical interface issues remain to be addressed before multimodal systems incorporating human language technology can succeed in actual field settings and demonstrate major advantages over simpler unimodal alternatives. For example, error resolution currently represents perhaps the most challenging obstacle to successful commercialization of recognition-based systems. One objective of this research is to model users' adapted language and performance during interactions involving error resolution, and to apply this information to the design of better error avoidance and resolution techniques for human language technology and multimodal systems. Another objective is to examine the impact of system display characteristics on users' subsequent linguistic input to multimodal systems, and to apply this information to the design of interfaces that effectively but transparently guide users' input to match system processing capabilities. To accomplish these objectives, a constellation of experiments is planned to investigate basic dimensions of human-computer interaction and multimodal interface design. These experiments will be conducted using a novel semi-automatic simulation technique that supports rapid subject-paced interaction with spoken, pen-based, and multimodal input. The primary significance of this research program will be a more principled empirical and theoretical foundation for understanding human interaction with interactive systems, as well as the improved design of advanced multimodal interfaces that incorporate human language technology.


Oviatt, S. L. Predicting spoken disfluencies during human-computer interaction, Computer Speech and Language, 1995, 9, 1, 19-35.

Cohen, P. R. and Oviatt, S. L. The role of voice input for human-machine communication, Proceedings of the National Academy of Sciences, 1995, vol. 92, no. 22, 9921-9927.

Oviatt, S. L., Cohen, P. R. and Wang, M. Q. Toward interface design for human language technology: Modality and structure as determinants of linguistic complexity, Speech Communication, European Speech Communication Association, 1994, vol. 15, nos. 3-4, 283-300.

Oviatt, S. L. and Olsen, E. Integration themes in multimodal human-computer interaction, Proceedings of the International Conference on Spoken Language Processing, (ed. by Shirai, Furui and Kakehi), Acoustical Society of Japan, 1994, vol. 2, 551-554.


In recent work, we have been interested in examining different communication models and modalities relevant to interactive system design (e.g., unimodal vs. multimodal; two-party vs. multi-party; unilingual vs. multilingual; interactive vs. noninteractive). Part of our work focuses on defining the primary characteristics of a communication modality, identifying any constellation of problems that may exist (especially for technologically mediated hybrid forms of communication), and proposing methods for optimizing human performance when using that technology in the future (e.g., through alteration of the basic system or interface design). Our research goals typically involve specifying preliminary target requirements for future systems not yet in existence (e.g., speech and pen recognition; multimodal systems incorporating human language and image-based technology). For this purpose, evaluations are conducted proactively, usually using a highly interactive simulation method. The results generated by this work are designed to create a "guidance system" for complex technology still in the planning stages.

Research based on this methodology is being used to investigate a wide range of topics relevant to interactive systems, recognition-based technology, and multimodal system design, including: (1) the impact of system displays, prompts, and feedback on subsequent user input, (2) techniques for guiding people's language input to coincide with system processing capabilities, such that processing can be robust, (3) error patterns and resolution strategies unique to recognition-based technologies involving language input, (4) identification of optimal niche applications for component technologies within a multimodal interface, (5) integration and synchronization of modalities within a multimodal interface, and (6) predictive modeling of human interaction with interactive unimodal and multimodal systems, and evaluation of the extent to which such systems succeed in supporting human performance.


Levelt, W. J. Speaking: From Intention to Articulation, MIT Press: Cambridge, MA, 1989.

Cole, R., Hirschman, L., et al., The Challenge of Spoken Language Systems: Research Directions for the Nineties, IEEE Transactions on Speech and Audio Processing, 1995, 3:1, 1-21.

Rhyne, J. R. and Wolfe, C. G. Recognition-based User Interfaces, in Advances in Human-Computer Interaction (ed. by H. R. Hartson and D. Hix), Ablex: Norwood, N.J., 4: 7, 191-250.


(1) Virtual Environments
(3) Other Communication Modalities
(4) Adaptive Human Interfaces
(6) Intelligent Interactive Systems for Persons with Disabilities