Toward a Theory of Metacontrol for Dialogue Systems

Alan W. Biermann
Department of Computer Science
Box 90129, Duke University
Durham, NC 27708-0129.

Email :
WWW Page :


Speech and Natural Language Understanding.


Voice interactive systems, human-machine collaboration, multimedia systems, user modelling, dialogue theory, human factors.


The Need

When humans collaborate with each other, they undertake a variety of behaviors that enable fast and efficient convergence to the goal. First, each participant is continuously engaged in mental problem solving, and the interactions are the result of that internal processing. When a participant believes that communication will enhance progress, he or she sends a message to the others. Participants do not start at the beginning and follow a strategy of comprehensive communication. Instead, they jump abruptly to key issues in the problem solving situation and address them aggressively. At unpredictable places, they drop the current discussion and jump to some other seemingly relevant issue, either resolving it or dropping it as well in order to jump to yet another issue or back to some previous discussion. This light-footed jumping from issue to issue as the problem solving goes forward enables the participants to attack subproblems as quickly as their minds can create them and to deal with them effectively so that other issues can be addressed.

The effectiveness of the interaction depends on many additional characteristics. For example, each communication from one participant to the other must carefully account for the other's knowledge level. Utterances must start from a knowledge level within the other's grasp and move in the direction of communicating the target fact. Another important feature is that the conversational initiative must go to the participant most able to direct progress. The initiative must pass back and forth between the participants as one or the other becomes the most appropriate leader in subsequent interactions. Yet another important feature is that so-called meta-statements need to be uttered from time to time to mark progress, to announce successes, and to direct next steps. These interactions help maintain morale and keep the participants' efforts synchronized.

The Theory

The tremendous effectiveness and efficiency of human-human collaborations can be approximated in human-machine collaborations to the extent that the above characteristics can be duplicated. Our project is developing a theory for the machine participant that enables these capabilities to some extent. Specifically, our system represents knowledge in a Prolog-like language and uses theorem proving to attempt to solve problems. Our strategy is to build the capabilities described above into our machine collaborator by augmenting the Prolog theorem proving mechanisms. For example, the theorem prover first attempts to solve the problem at hand using its internal mechanisms. It does not initiate communication until a problem arises where some part of the proof cannot be completed. If it determines from the user model that the other participant may be able to contribute to the currently failing subproof, it will execute a very focused interaction with the user to attempt to find how to finish that subproof. The attempt at the subproof can result in diversions to other subproofs which, when solved, enable the original subproof to succeed. The associated interactions are the subdialogues that are so common in human-human interactions.
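The control flow just described can be illustrated with a minimal sketch in Python. It is not the project's implementation, which augments a Prolog-like theorem prover. The rule base, the `user_can_help` set standing in for the user model, the `ask_user` callback, and all goal names are assumptions introduced here for illustration, and the rule base is assumed to be acyclic.

```python
from typing import Callable, Dict, List, Set

# Hypothetical rule base: each goal maps to alternative lists of subgoals.
# An empty subgoal list means the goal is a known fact.
Rules = Dict[str, List[List[str]]]


def prove(goal: str,
          rules: Rules,
          user_can_help: Set[str],
          ask_user: Callable[[str], bool],
          transcript: List[str],
          depth: int = 0) -> bool:
    """Try to prove `goal` internally; turn to the user only when stuck."""
    # First, attempt every internal rule for this goal.
    for body in rules.get(goal, []):
        if all(prove(sub, rules, user_can_help, ask_user, transcript, depth + 1)
               for sub in body):
            return True
    # The internal proof failed. Start a focused interaction only if the
    # (assumed) user model says the user may be able to supply this goal.
    if goal in user_can_help:
        transcript.append("  " * depth + "ask: " + goal)
        return ask_user(goal)
    return False


# Illustrative knowledge base; the goal names are invented for this sketch.
rules: Rules = {
    "circuit_works": [["power_on", "led_lit"]],
    "led_lit": [["wire_connected", "led_ok"]],
    "led_ok": [[]],
}
user_can_help = {"power_on", "wire_connected"}
answers = {"power_on": True, "wire_connected": True}

transcript: List[str] = []
result = prove("circuit_works", rules, user_can_help,
               lambda g: answers.get(g, False), transcript)
```

In this sketch the indentation recorded in `transcript` reflects proof depth, so a question raised while working on a nested subproof appears as a nested entry, loosely mirroring a subdialogue.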

The Prolog-style theorem proving provides an excellent environment for examining, and to some extent solving, the other issues of dialogue. There are simple and natural means for encoding a user model into the knowledge base so that interactions are limited to user-appropriate communications. It is possible to judge where initiative should be taken or released to the other participant and to properly lead or follow in each specific situation. It is also possible to devise a theory of meta-statements and to appropriately assert them or receive and process them.
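One way to picture a user model that limits interactions to user-appropriate communications is the following hedged Python sketch. It assumes a hypothetical task decomposition table and a set of tasks the user is believed to know how to perform; neither the names nor the selection rule come from the project itself.

```python
from typing import Dict, List, Set

# Hypothetical task decomposition: a task maps to the substeps that achieve it.
Decomposition = Dict[str, List[str]]


def plan_requests(task: str, steps: Decomposition, user_knows: Set[str]) -> List[str]:
    """Phrase a request at the highest level the user model says the user can handle."""
    # If the user is believed to know the task, or it cannot be broken down
    # further, request it directly.
    if task in user_knows or task not in steps:
        return [task]
    # Otherwise descend to the substeps and plan each of them in turn.
    requests: List[str] = []
    for sub in steps[task]:
        requests.extend(plan_requests(sub, steps, user_knows))
    return requests


# Illustrative data; the task names are invented for this sketch.
steps: Decomposition = {
    "measure the voltage": [
        "set the meter to DC volts",
        "touch the probes to the terminals",
        "read the display",
    ],
}

expert_requests = plan_requests("measure the voltage", steps, {"measure the voltage"})
novice_requests = plan_requests("measure the voltage", steps, set())
```

Under these assumptions an experienced user receives a single high-level request, while a novice receives the three substeps, so the same knowledge base yields different dialogues for different users.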

The Implementation and Results

Our project has implemented several voice interactive dialogue systems to test these ideas and to gain experience with them. Two examples are our Circuit Fixit Shoppe and Programming Tutor systems. The Circuit Fixit Shoppe was completed in 1991 and tested extensively as described in references listed below. This system demonstrated many of the characteristics described above and was successfully used by human subjects in 141 problem solving sessions to find the bugs in and repair electric circuits. The Programming Tutor is currently being constructed, has a much cleaner and simpler design, and has full graphics and typed text communication in addition to voice for a full multimedia capability.

Both systems have been tested extensively with human subjects, with highly successful results. Success in problem solving was in the 80 percent plus range, speaking rates were as high as several sentences per minute, sentence recognition rates were in the 80 percent range, and user subjective responses were very positive.



This general area is extremely broad and includes fields related to every stage of processing: speech recognition, parsing theory, semantics theory, representation of knowledge, collaborative theory, dialogue theory, natural language generation, speech generation, multimedia communication, user modelling, and much more.




There are long lists of related projects including the following: