Robotics Institute and Human-Computer Interaction Institute
Carnegie Mellon University
FAX: (412) 268-6298
The proposed case study of prosody focuses on an educational task that combines intrinsic national importance with compelling methodological advantages. The proposed educational task is to "coach" children's oral reading -- that is, display text on the screen, listen to a child read it aloud, detect the child's mistakes, decide when and how to intervene, and provide help and encouragement. The proposed research builds on the code, data, and experience gained from a working prototype of such a coach, developed with prior NSF support.
The proposed research will focus on improving four aspects of dialogue -- taking turns, handling speech repairs, preventing dialogue breakdown, and modelling the speaker. In the context of the reading task, these aspects include detecting a number of pedagogically significant events, such as when readers complete a passage, correct themselves, or encounter difficulty in identifying a word or comprehending a passage. The proposed research will use prosodic cues to help detect these events in order to make the dialogue between student and coach more effective in achieving its educational objectives.
Expected outcomes include not only improvements in the reading coach, but more generally the discovery of robust prosodic phenomena, methods for detecting them, and principles for using them to improve spoken communication so as to better accomplish the task at hand. This work will lay essential foundations for using prosody to achieve graceful, effective spoken dialogue between humans and computers.
[Mostow et al 93] J. Mostow, A. G. Hauptmann, L. L. Chase, and S. Roth. Towards a Reading Coach that Listens: Automated Detection of Oral Reading Errors. In Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93), pages 392-397. American Association for Artificial Intelligence, Washington, DC, July, 1993.
[Mostow et al 94a] J. Mostow, S. Roth, A. Hauptmann, M. Kane, A. Swift, L. Chase, and B. Weide. A reading coach that listens: (edited) video transcript. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), page 1507. Seattle, WA, August, 1994.
[Mostow et al 94b] J. Mostow, S. Roth, A. G. Hauptmann, and M. Kane. A Prototype Reading Coach that Listens. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), pages 785-792. American Association for Artificial Intelligence, Seattle, WA, August, 1994. Recipient of the AAAI-94 Outstanding Paper Award.
[Mostow et al 94c] J. Mostow, S. Roth, A. Hauptmann, M. Kane, A. Swift, L. Chase, and B. Weide. A Reading Coach that Listens (6-minute video). In Video Track of the Twelfth National Conference on Artificial Intelligence (AAAI-94). American Association for Artificial Intelligence, Seattle, WA, August, 1994.
[Mostow et al 95] J. Mostow, A. Hauptmann, and S. Roth. Demonstration of a Reading Coach that Listens. In Proceedings of the Eighth Annual Symposium on User Interface Software and Technology. Sponsored by ACM SIGGRAPH and SIGCHI in cooperation with SIGSOFT, Pittsburgh, PA, November, 1995.
The coach is designed to provide a combination of reading and listening, in which the child reads whenever possible, and the coach helps whenever necessary, so as to provide a pleasant, successful reading experience. The coach's assistance, modelled after expert reading teachers, is intended to support word identification, comprehension, and motivation.
[Barr et al 91] R. Barr, M. L. Kamil, P. B. Mosenthal, and P. D. Pearson. Handbook of Reading Research. Longman Publishing Group, 95 Church Street, White Plains, NY 10601, 1991. ISBN 0-8013-0292-7.
[Huang et al 93] X. D. Huang, F. Alleva, H. W. Hon, M. Y. Hwang, K. F. Lee, and R. Rosenfeld. The SPHINX-II speech recognition system: An overview. Computer Speech and Language 7(2):137-148, April, 1993.
[NCES 93a] National Center for Education Statistics. NAEP 1992 Reading Report Card for the Nation and the States: Data from the National and Trial State Assessments. Technical Report No. 23-ST06, U.S. Department of Education, Washington, DC, September, 1993.
[NCES 93b] National Center for Education Statistics. Adult Literacy in America. Technical Report GPO 065-000-00588-3, U.S. Department of Education, Washington, DC, September, 1993.
[OTA 93] Office of Technology Assessment. Adult Literacy and New Technologies: Tools for a Lifetime. Technical Report OTA-SET-550, U.S. Congress, Washington, DC, July, 1993.
[Pearson 84] P. D. Pearson (editor). Handbook of Reading Research. Longman Publishing Group, New York, 1984. ISBN 0-582-28119-9.
[Pierrehumbert and Hirschberg 90] J. Pierrehumbert and J. Hirschberg. The meaning of intonational contours in the interpretation of discourse. In P. Cohen, J. Morgan, and M. Pollack (editors), Intentions and Plans in Communication and Discourse. MIT Press, 1990.
[USCD 91] USCD. Closing the Literacy Gap in American Business. Technical Report, United States Commerce Department, 1991.
[Waibel and Lee 90] A. Waibel and K.-F. Lee. Readings in Speech Recognition. Morgan Kaufmann, San Mateo, CA, 1990.
4. Adaptive Human Interfaces.
5. Usability and User-Centered Design.
6. Intelligent Interactive Systems for Persons with Disabilities.