WRITING AND TALKING TO FUTURE INTERACTIVE SYSTEMS
Sharon Oviatt
Department of Computer Science and Engineering
Oregon Graduate Institute of Science and Technology
CONTACT INFORMATION
P.O. Box 91000
Portland, Oregon 97291
oviatt@cse.ogi.edu
office: (503) 690-1342
FAX: (503) 690-1548
WWW PAGE
www: http://www.cse.ogi.edu/~oviatt
PROGRAM AREA
Primary-- (5) Usability and User-Centered Design.
Secondary-- (2) Speech and Natural Language Understanding;
(3) Other Communication Modalities.
KEYWORDS
multimodal interfaces; spoken and pen-based interaction; dynamic
interactive visual displays; error patterns and resolution; predictive
modeling
PROJECT SUMMARY
Multimodal systems offer the potential for considerable flexibility, broad
utility, and use by a larger and more diverse population than ever before.
However, critical interface issues remain to be addressed before multimodal
systems incorporating human language technology can succeed in actual field
settings, and can demonstrate major advantages over simpler unimodal
alternatives. For example, error resolution currently represents perhaps
the most challenging obstacle to successful commercialization of
recognition-based systems. One objective of this research is to model
users' adapted language and performance during interactions involving error
resolution, and to apply this information to the design of better error
avoidance and resolution techniques for human language technology and
multimodal systems. Another objective is to examine the impact of system
display characteristics on users' subsequent linguistic input to multimodal
systems, and to apply this information to the design of interfaces that
effectively but transparently guide users' input to match system processing
capabilities. To accomplish these objectives, a constellation of
experiments is planned to investigate basic dimensions of human-computer
interaction and multimodal interface design. These
experiments will be conducted using a novel semi-automatic simulation
technique that supports rapid subject-paced interaction with spoken,
pen-based, and multimodal input. The primary
significance of this research program will be a more principled empirical
and theoretical foundation for understanding human interaction with
interactive systems,
as well as the improved design of advanced multimodal interfaces that
incorporate human language technology.
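To make the error-resolution objective concrete, the following sketch (in
Python, with entirely hypothetical names, prompts, and thresholds, none of
them drawn from the project itself) illustrates one simple technique of
the kind at issue: progressively constraining the spoken prompt after
recognition failures, and finally inviting a switch to pen input so the
user does not keep repeating the same error in the failed channel.

    from dataclasses import dataclass

    @dataclass
    class FieldState:
        errors: int = 0          # consecutive failed recognition attempts

    class ErrorResolutionPolicy:
        MAX_SPEECH_RETRIES = 2   # assumed threshold before a modality switch

        def __init__(self):
            self.fields = {}     # field name -> FieldState

        def next_prompt(self, name):
            state = self.fields.setdefault(name, FieldState())
            if state.errors == 0:
                return "Please say the %s." % name
            if state.errors <= self.MAX_SPEECH_RETRIES:
                # Re-prompt with a more constrained format to reduce
                # disfluencies and out-of-vocabulary input.
                return "Please say only the %s, one item at a time." % name
            # Modality switch: pen input sidesteps the failing recognizer.
            return "Please write the %s in the box below." % name

        def report_failure(self, name):
            self.fields.setdefault(name, FieldState()).errors += 1

        def report_success(self, name):
            self.fields.setdefault(name, FieldState()).errors = 0

A fielded system would presumably trigger report_failure from recognizer
confidence scores or user corrections rather than explicit failure events;
the sketch abstracts away from that.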
PROJECT REFERENCES
Oviatt, S. L. Predicting spoken disfluencies during human-computer
interaction, Computer Speech and Language, 1995, vol. 9, no. 1, 19-35.
Cohen, P. R. and Oviatt, S. L. The role of voice input for human-machine
communication, Proceedings of the National Academy of Sciences, 1995,
vol. 92, no. 22, 9921-9927.
Oviatt, S. L., Cohen, P. R. and Wang, M. Q. Toward interface design for
human language technology: Modality and structure as determinants of
linguistic complexity, Speech Communication, 1994, vol. 15, nos. 3-4,
283-300.
Oviatt, S. L. and Olsen, E. Integration themes in multimodal
human-computer interaction, Proceedings of the International Conference
on Spoken Language Processing (ed. by Shirai, Furui and Kakehi),
Acoustical Society of Japan, 1994, vol. 2, 551-554.
AREA BACKGROUND
In recent work, we have been interested in examining different
communication models and modalities relevant to interactive system design
(e.g., unimodal vs. multimodal; two-party vs. multi-party; unilingual vs.
multilingual; interactive vs. noninteractive). Part of our work focuses on
defining the primary characteristics of a communication modality,
identifying any constellation of problems that may exist (especially
for technologically mediated hybrid forms of communication), and proposing
methods for optimizing human performance when using that technology in the
future (e.g., through alteration of the basic system or interface design).
Our research goals typically involve specifying preliminary target
requirements for future systems not yet in existence (e.g., speech and pen
recognition; multimodal systems incorporating human language and
image-based technology). For this purpose, evaluations are conducted
proactively, usually using a highly interactive simulation method. The
results generated by this work are designed to create a "guidance system"
for complex technology still in the planning stages.
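As an illustration of what such a simulation involves, the sketch below
outlines a minimal semi-automatic interaction loop of the general kind
often called a "Wizard of Oz" setup: a hidden assistant stands in for the
recognizer, selecting canned system responses with a single keystroke,
while the program itself handles pacing and delivery so the exchange stays
rapid and subject-paced. The canned responses and delay value are
illustrative assumptions, not details of our actual method.

    import time

    CANNED_RESPONSES = {      # assistant keystroke -> simulated system output
        "1": "OK. Field accepted.",
        "2": "Sorry, please repeat that.",
        "3": "Please write your answer instead.",
    }
    RESPONSE_DELAY_S = 0.5    # fixed short delay to mimic system latency

    def run_trial():
        while True:
            subject_input = input("SUBJECT> ")   # transcribed spoken/pen input
            if subject_input.lower() == "quit":
                break
            # The hidden assistant sees the input and picks a response key.
            key = input("[wizard sees: %r] choose 1/2/3> " % subject_input)
            time.sleep(RESPONSE_DELAY_S)         # keep pacing uniform
            print("SYSTEM>", CANNED_RESPONSES.get(key, CANNED_RESPONSES["2"]))

    if __name__ == "__main__":
        run_trial()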
Research based on this methodology is being used to investigate a wide
range of topics relevant to interactive systems, recognition-based
technology, and multimodal system design, including:
(1) the impact of system displays, prompts, and feedback on subsequent
    user input;
(2) techniques for guiding people's language input to coincide with
    system processing capabilities, such that processing can be robust;
(3) error patterns and resolution strategies unique to recognition-based
    technologies involving language input;
(4) identification of optimal niche applications for component
    technologies within a multimodal interface;
(5) integration and synchronization of modalities within a multimodal
    interface; and
(6) predictive modeling of human interaction with unimodal and multimodal
    systems, and evaluation of the extent to which such systems succeed in
    supporting human performance (see the sketch after this list).
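As a toy illustration of item (6), the sketch below fits a simple linear
model of spoken disfluency rate as a function of utterance length, in the
general spirit of the disfluency modeling reported in Oviatt (1995), and
then uses it to compare planned prompt designs. The data points are
fabricated placeholders, not measurements from our studies.

    def fit_line(xs, ys):
        # Ordinary least-squares fit of y = a + b * x.
        n = len(xs)
        mx = sum(xs) / n
        my = sum(ys) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        b = sxy / sxx
        return my - b * mx, b

    # Hypothetical per-condition means: utterance length (words) vs.
    # disfluencies per 100 words.
    lengths = [4, 7, 10, 14, 18]
    rates = [0.6, 1.1, 1.6, 2.4, 3.1]

    a, b = fit_line(lengths, rates)
    print("disfluency rate ~= %.2f + %.2f * utterance length" % (a, b))
    # A constrained, form-based prompt that shortens utterances would be
    # predicted to show a substantially lower disfluency rate than an
    # unconstrained prompt eliciting long utterances.
    print("predicted at 5 words: %.2f" % (a + b * 5))
    print("predicted at 15 words: %.2f" % (a + b * 15))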
AREA REFERENCES
Levelt, W. J. M. Speaking: From Intention to Articulation, MIT Press:
Cambridge, MA, 1989.
Cole, R., Hirschman, L., et al. The Challenge of Spoken Language Systems:
Research Directions for the Nineties, IEEE Transactions on Speech and
Audio Processing, 1995, vol. 3, no. 1, 1-21.
Rhyne, J. R. and Wolfe, C. G. Recognition-based User Interfaces, in
Advances in Human-Computer Interaction (ed. by H. R. Hartson and D. Hix),
Ablex: Norwood, NJ, vol. 4, ch. 7, 191-250.
RELATED PROGRAM AREAS
(1) Virtual Environments
(3) Other Communication Modalities
(4) Adaptive Human Interfaces
(6) Intelligent Interactive Systems for Persons with Disabilities