Toward a Theory of Metacontrol for Dialogue Systems
Alan W. Biermann
Department of Computer Science
Box 90129,
Duke University
Durham, NC 27708-0129.
Email : awb@cs.duke.edu
Tel:(919)660-6500
Fax:(919)660-6519
WWW Page : http://www.cs.duke.edu/~awb
PROGRAM AREA
Speech and Natural Language Understanding.
KEYWORDS
Voice interactive systems, human-machine collaboration, multimedia
systems, user modelling, dialogue theory, human factors.
PROJECT SUMMARY
The Need
When humans collaborate with each other, they undertake a variety of
behaviors that enable fast and efficient convergence to the goal. First,
each participant is continuously involved in mental problem solving and the
interactions are the result of that internal processing. When one
participant believes he or she has reached a point where communication will
enhance progress, that participant sends a message to the others. They do not
start at the beginning and follow a strategy of comprehensive communication.
Instead, they abruptly jump to key issues in the problem solving situation
and address them aggressively. At unpredictable places, they drop the
current discussion and jump to some other seemingly relevant issue either
resolving it or dropping it also to jump to yet another or to some previous
discussion. This light-footed jumping from issue to issue as the problem
solving scenario goes forward enables the participants to attack subproblems
as quickly as their minds can create them and to deal with them effectively
so that other issues can be addressed.
The effectiveness of the interaction is dependent on many additional
characteristics. For example, each communication from one participant to
the other must carefully account for the other's knowledge level. Utterances
must start from a knowledge level within the other's grasp and move in the
direction of communicating the target fact. Another important feature is
that the conversational initiative must go to the participant most able to
direct progress. The initiative must pass back and forth between them as
one participant or another becomes the most appropriate leader in subsequent
interactions. Yet another important feature is that so-called
meta-statements need to be uttered from
time to time to mark progress, to announce successes, and to direct next
steps. These interactions help maintain morale and keep participant efforts
synchronized.
The Theory
The tremendous effectiveness and efficiency of human-human collaborations
can be approximated in human-machine collaborations to the extent that the
above characteristics can be duplicated. Our project is developing a theory
for the machine participant that enables these capabilities to some extent.
Specifically, our system represents knowledge in a Prolog-like language and
it does theorem proving to attempt to solve problems. Our strategy is to
build the capabilities described above into our machine collaborator by
augmenting the Prolog theorem proving mechanisms. For example, the theorem
proving mechanism attempts to solve the problem at hand using its internal
mechanisms. It does not initiate communication until a problem arises where
some part of the proof cannot be completed. If it determines from the user
model that the other participant may be able to contribute to the currently
failing sub proof, it will execute a very focused interaction with the user
to attempt to find how to finish that sub proof. The attempt at the sub
proof can divert to other sub proofs which, when solved, enable the original
sub proof to succeed. The associated interactions are the sub dialogues that are
so common in human-human interactions.
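
The behavior described above can be illustrated with a minimal sketch. It is
not the project's implementation: it uses Python rather than a Prolog-like
language, it handles only propositional rules without variables, and it
assumes the rule set is acyclic. The names prove, user_can_answer, ask_user,
and the circuit facts are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical rule format: goal -> list of alternative bodies (lists of subgoals).
Rules = Dict[str, List[List[str]]]


def prove(
    goal: str,
    rules: Rules,
    facts: Set[str],
    user_can_answer: Set[str],          # stand-in for the user model (assumption)
    ask_user: Callable[[str], bool],    # stand-in for one focused interaction
    trace: List[Tuple[int, str]],
    depth: int = 0,
) -> bool:
    """Try to prove `goal` internally; talk to the user only when the proof stalls."""
    if goal in facts:
        return True
    # First attempt the proof with internal knowledge only.
    for body in rules.get(goal, []):
        if all(
            prove(sub, rules, facts, user_can_answer, ask_user, trace, depth + 1)
            for sub in body
        ):
            facts.add(goal)
            return True
    # The sub proof failed. Ask only if the user model suggests the user can help.
    if goal in user_can_answer:
        trace.append((depth, goal))     # records a subdialogue and its nesting depth
        if ask_user(goal):
            facts.add(goal)
            return True
    return False


# Illustrative data (assumed, not taken from the Circuit Fixit Shoppe).
rules: Rules = {
    "circuit_works": [["power_on", "led_lit"]],
    "led_lit": [["wire_connected"]],
}
answers = {"power_on": True, "wire_connected": True}
trace: List[Tuple[int, str]] = []
result = prove(
    "circuit_works", rules, set(), {"power_on", "wire_connected"},
    lambda g: answers.get(g, False), trace,
)
```

In this sketch the prover never asks about circuit_works or led_lit, because
the user model does not list them; it asks only about the two leaf goals where
its own proof fails, and the recorded depth shows the question about
wire_connected arising inside the sub proof of led_lit.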
The Prolog style theorem proving provides an excellent environment for
examining and to some extent solving the other issues of dialogue. Thus
there are simple and natural means for encoding a user model into the
knowledge base so that interactions are limited to user appropriate
communications. It is possible to judge where initiative should be taken or
released to the other and to properly lead or follow in each specific
situation. It is possible to devise a theory of meta-statements and to
appropriately assert them or receive and process them.
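
As a rough illustration of how a user model might drive these decisions, the
following sketch picks who should hold the initiative for a goal and prepares
a request that starts from what the user is believed to know. Everything in it
is an assumption made for illustration: the UserModel fields, the numeric
competence estimates, and the functions choose_initiative and phrase_request
are hypothetical and are not drawn from the systems described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class UserModel:
    """Hypothetical user model: believed knowledge and estimated ability."""
    knows: Set[str] = field(default_factory=set)
    # Estimated chance (0..1) that the user can advance a given goal (assumption).
    competence: Dict[str, float] = field(default_factory=dict)


def choose_initiative(
    goal: str, machine_competence: Dict[str, float], user: UserModel
) -> str:
    """Give the lead to the participant estimated more able to advance `goal`."""
    machine = machine_competence.get(goal, 0.0)
    human = user.competence.get(goal, 0.0)
    return "machine" if machine >= human else "user"


def phrase_request(
    goal: str, prerequisites: Dict[str, List[str]], user: UserModel
) -> List[str]:
    """Explain only the concepts the user is not believed to know, then ask."""
    steps = [f"explain {c}" for c in prerequisites.get(goal, []) if c not in user.knows]
    steps.append(f"ask about {goal}")
    return steps


# Illustrative data (assumed).
user = UserModel(knows={"voltmeter"}, competence={"measure_voltage": 0.9})
leader = choose_initiative("measure_voltage", {"measure_voltage": 0.2}, user)
steps = phrase_request(
    "measure_voltage", {"measure_voltage": ["voltmeter", "terminal"]}, user
)
```

Here the user takes the lead on measure_voltage because the assumed estimate
favors the user, and the request skips the explanation of voltmeter because
the model already credits the user with that concept.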
The Implementation and Results
Our project has implemented several voice interactive dialogue systems to
test these ideas and to gain experience with them. Two examples are our
Circuit Fixit Shoppe and Programming Tutor systems. The Circuit Fixit
Shoppe was completed in 1991 and tested extensively as described in
references listed below. This system demonstrated many of the
characteristics described above and was successfully used by human subjects
in 141 problem solving sessions to find the bugs in and repair electric
circuits. The Programming Tutor is currently being constructed, has a much
cleaner and simpler design, and has full graphics and typed text
communication in addition to voice for a full multimedia capability.
Both systems have been tested extensively with human subjects with highly
successful results. Success in problem solving was in the 80 percent plus
range, speaking rates were as high as several sentences per minute, sentence
recognition rates were in the 80 percent range, and user subjective responses
were very positive.
PROJECT REFERENCES
- R. W. Smith, D. R. Hipp, and A. W. Biermann, "A Dialog Control Algorithm
and Its Performance," Proceedings of the Third Conference on Applied
Natural Language Processing, Trento, Italy, April 1-3, 1992.
- Alan W. Biermann, Curry I. Guinn, D. Richard Hipp, and Ronnie W. Smith,
"Efficient Collaborative Discourse: A Theory and its Implementation,"
Proceedings of the ARPA Human Language Technology Workshop, Princeton,
N.J., March, 1993.
- Ronnie W. Smith and D. Richard Hipp, Spoken Natural Language Dialog
Systems, Oxford University Press, New York, 1994.
- Ronnie W. Smith, D. Richard Hipp, and Alan W. Biermann, "An Architecture
for Voice Dialogue Systems Based on Prolog-Style Theorem Proving,"
Computational Linguistics, Vol. 21, No. 3, September, 1995.
AREA BACKGROUND
This general area is extremely broad and includes fields related to every
stage of processing: speech recognition, parsing theory, semantics theory,
representation of knowledge, collaborative theory, dialogue theory,
natural language generation, speech generation, multimedia communication,
user modelling, and much more.
AREA REFERENCES
- Computational Linguistics, the journal.
- James Allen, Natural Language Understanding, Second Edition,
Benjamin/Cummings Publishing Company, Inc, 1994.
RELATED PROGRAM AREAS
- Virtual Environments
- Other Communication Modalities
- Adaptive Human Interfaces
- Usability and User-Centered Design
- Intelligent Interactive Systems for Persons with Disabilities
POTENTIAL RELATED PROJECTS
There are long lists of related projects including the following:
- the human factors of voice interactive problem solving systems
- learning for optimization of dialogue performance
- strategies for tutoring and their automation
- studies of multimedia systems