Norman I. Badler
Mark Steedman

Computer and Information Science Department
200 South 33rd St.
University of Pennsylvania
Philadelphia, PA 19104-6389


215-898-5862 (Badler)
215-573-7453 fax

215-898-2012 (Steedman)
215-898-0587 fax



Other Communication Modalities.


Communicating agents, gesture, facial animation, dialog planning, simulated humans


The goal of this research is to develop a system that automatically generates and animates conversations between multiple cooperative agents, with appropriate and synchronized speech, intonation, facial expression, and hand gesture. The research is based on theory that addresses the relations and coordination between these channels. Its significance is to provide a 3D computer animation testbed for theories of cooperative conversation; human-machine interaction and training systems need more interactive and cooperative synthetic agents. Conversations are created by a dialogue planner that produces the text as well as the intonation of the utterances. The speaker/listener relationship, the content of the text, the intonation, and the actions undertaken together drive the generators for facial expression, lip motion, eye gaze, head motion, and arm gesture. This project will focus on domains in which agents must propose and agree on abstract plans, and may have to motivate and carry out physical actions and refer to objects in their physical environment during conversation.


"Animated conversation: rule-based generation of facial expression, gesture and spoken intonation for multiple conversational agents," Computer Graphics, pp. 413-420, July 1994. (J. Cassell, C. Pelachaud, N. Badler, M. Steedman, B. Achorn, W. Becket, B. Douville, S. Prevost, and M. Stone.)

Simulating Humans: Computer Graphics, Animation, and Control, Oxford University Press, June 1993 (N. I. Badler, C. B. Phillips and B. L. Webber).

"The Center for Human Modeling and Simulation," Presence, the Journal of Virtual Reality and Teleoperators 4(1), pp. 81-96, 1995. (N. Badler, D. Metaxas, B. Webber, and M. Steedman).

"Generating facial expressions for speech," Cognitive Science, to appear, 1995. (C. Pelachaud, N. Badler, and M. Steedman).

"Planning for animation," in N. Magnenat-Thalmann and D. Thalmann (eds.), Interactive Computer Animation, Prentice-Hall, 1995 (N. Badler, B. Webber, W. Becket, C. Geib, M. Moore, C. Pelachaud, B. Reich, and M. Stone).

"Simulating humans in VR," in R. Earnshaw, H. Jones, and J. Vince (eds.), Virtual Reality and its Applications, Academic Press, London, UK, 1995 (J. Granieri and N. Badler).


This work draws on research from the areas of facial animation, attention, object manipulation, response generation (including speech, intonation, and gesture synthesis), and planning and reasoning about actions and conversation.

Several approaches to facial animation are reviewed in (Pelachaud et al. 1994). Our system automatically generates muscle movements based on sequences of phonemes and other content-related information. The system uses similar content-related information to generate communicative manual gestures (McNeill 1992, Tuite 1993).
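To make the phoneme-driven generation concrete, here is a minimal sketch of the idea: phonemes are mapped to visemes (visually distinct mouth shapes), and each viseme to a set of facial muscle activations. The tables and muscle names below are illustrative assumptions, not the system's actual data.

```python
# Hypothetical sketch: deriving facial muscle activations from a phoneme
# sequence via visemes. The phoneme-to-viseme table and muscle names are
# invented for illustration.

PHONEME_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lower_lip_to_teeth", "v": "lower_lip_to_teeth",
    "aa": "jaw_open", "iy": "lips_spread", "uw": "lips_rounded",
}

VISEME_TO_MUSCLES = {
    "lips_closed":        {"orbicularis_oris": 1.0},
    "lower_lip_to_teeth": {"orbicularis_oris": 0.6, "depressor_labii": 0.4},
    "jaw_open":           {"jaw_rotation": 0.8},
    "lips_spread":        {"zygomatic_major": 0.7},
    "lips_rounded":       {"orbicularis_oris": 0.9},
}

def muscle_track(phonemes):
    """Return one muscle-activation dict per phoneme (empty if unknown)."""
    track = []
    for ph in phonemes:
        viseme = PHONEME_TO_VISEME.get(ph)
        track.append(dict(VISEME_TO_MUSCLES.get(viseme, {})))
    return track

track = muscle_track(["m", "aa", "p"])
```

A real system would additionally smooth activations across adjacent phonemes (coarticulation) and blend in expression channels driven by the discourse content.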

In order to be believable, simulated agents must direct gaze and deploy visual attention appropriately. Our work attempts to create a model of visual attention and gaze interaction (Argyle and Cook 1976, Ballard et al. 1992, Kahneman 1973).

Synthetic agents that manipulate objects in convincing ways also need multi-layered action planners which bridge the gap between the power of general-purpose AI planning, and robotic and animation systems which have concentrated on robust performance issues. Such intermediate planners use object-specific knowledge to decompose actions and generate action parameters in order to mediate between abstract plans and the specifics of the performance (Levison and Badler 1994).
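The intermediate-planner idea can be sketched as follows: an abstract plan step is expanded into parameterized motor steps using knowledge stored with the target object. The object models, method names, and parameters here are invented assumptions used only to illustrate object-specific decomposition.

```python
# Hypothetical sketch of object-specific action decomposition: the same
# abstract step ("open X") expands differently depending on how the target
# object affords being opened. All object knowledge below is invented.

OBJECT_MODELS = {
    "door":   {"open_method": "swing", "handle": "knob"},
    "drawer": {"open_method": "slide", "handle": "pull"},
}

def decompose(action, obj):
    """Expand an abstract action into a list of parameterized motor steps."""
    model = OBJECT_MODELS[obj]
    if action == "open":
        if model["open_method"] == "swing":
            return [("grasp", model["handle"]),
                    ("turn", model["handle"]),
                    ("pull_arc", obj)]
        if model["open_method"] == "slide":
            return [("grasp", model["handle"]),
                    ("pull_linear", obj)]
    raise ValueError(f"no decomposition for {action} on {obj}")

steps = decompose("open", "drawer")
```

The point of the intermediate layer is that the abstract planner never needs to know about knobs or pull handles; that knowledge lives with the object model.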

These planners must integrate planning and action, interleave acting with sensing and attention (Ballard et al. 1992), and determine when to act to acquire further knowledge. This includes treating speech acts in the same way as physical actions. It is the specification of communicative acts by the planner that determines all aspects of the utterance (Prevost and Steedman 1994). Rather than using pre-existing text, a representation of speech, including intonational and gestural markers, is automatically generated based on the semantic content determined by the planner. The semantics includes the relation of the utterance to other utterances in the discourse (Pierrehumbert and Hirschberg 1990).
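A minimal sketch of such an annotated speech representation, assuming a simple rule that discourse-new material receives a pitch accent and a co-timed gesture. The accent label follows the Pierrehumbert-style H* convention mentioned above; the data structures and the marking rule itself are illustrative assumptions, not the project's actual representation.

```python
# Hypothetical sketch: a planner-produced utterance annotated with
# intonational (pitch-accent) and gestural markers prior to synthesis.

from dataclasses import dataclass


@dataclass
class Word:
    text: str
    accent: str = ""   # e.g. "H*" on discourse-new (rhematic) material
    gesture: str = ""  # e.g. "beat", co-timed with the accented word


def mark_utterance(words, new_info):
    """Accent discourse-new words and co-time a beat gesture with them."""
    marked = []
    for w in words:
        if w in new_info:
            marked.append(Word(w, accent="H*", gesture="beat"))
        else:
            marked.append(Word(w))
    return marked


utt = mark_utterance(["the", "bank", "approved", "the", "loan"],
                     new_info={"approved", "loan"})
```

Because intonation and gesture are attached to the same semantic specification, the synthesized speech, pitch contour, and hand movement stay synchronized by construction.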


Gaze and Mutual Gaze, Cambridge University Press, 1976 (M. Argyle and M. Cook).

"Hand-eye Coordination During Sequential Tasks", Philosophical Transactions of the Royal Society of London, V337, p. 331-339, September 29, 1992 (D. Ballard, et. al.)

Attention and Effort, Prentice-Hall, 1973 (D. Kahneman)

"How Animated Agents Perform Tasks: Connecting Planning and Manipulation Through Object-Specific Reasoning," Presented at the AAAI Spring Symposium: Toward Physical Interaction and Manipulation, March 1994 (L. Levison and N. Badler)

Hand and Mind: What Gestures Reveal about Thought, University of Chicago Press, 1992 (David McNeill).

"Final Report to NSF of the Standards for Facial Animation Workshop", October, 1994 (C. Pelachaud, N. Badler, M. Viaud)

"The meaning of Intonational Contours in the Interpretation of Discourse", Intentions in Communication ed. by P. Cohen, J. Morgan and M. Pollock, 1990, (J. Pierrehumbert and J. Hirschberg).

"Specifying Intonation from Context for Speech Synthesis" Speech Communication, 15, 139-153, 1994, (Mark Steedman and Scott Prevost).

"The Production of Gesture", Semiotica, 93(1/2), 1993 (K. Tuite)


1. Virtual Environments.

2. Speech and Natural Language Understanding.

4. Adaptive Human Interfaces.

6. Intelligent Interactive Systems for Persons with Disabilities.


Communication with simulated humans in virtual environments.

Cross-modality communication for persons with disabilities (e.g., animating gestures given only speech).