Washington University
Department of Computer Science
Campus Box 1045
St. Louis, MO 63130
314-935-6123
314-935-7302 (FAX)
This research combines natural language processing and recurrent neural networks to determine the technical challenges and benefits of systems designed in this way. Specifically, this research is:
We have developed a large covering grammar of English and are refining the design of a connectionist parser. In our model, parsing is controlled by a recurrent neural network, and attention-shifting rules are not required. The network is trained from processing traces taken from sample sentences, and we have developed a sizable collection (more than 200) of simple and complex sentences for training and testing purposes. The output of each processing step is a structure-building action that, when executed, symbolically constructs part of the output structure. The grammar has 75 rules, a small vocabulary, and a wide variety of sentence forms, including sentences with multiple secondary sentences. Sentences require between 15 and 90 processing steps to complete, with an average of about 35 steps.
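The control scheme described above can be sketched as a simple recurrent (Elman-style) network that reads one token per step and emits one structure-building action per step. Everything here is an illustrative assumption: the layer sizes, the random weights, and the action inventory are invented for the sketch, not taken from the actual system; only the overall shape (recurrent hidden state, tanh units, one action per processing step) reflects the description.

```python
# Minimal sketch of a simple recurrent network emitting one
# structure-building action per processing step. Dimensions, weights,
# and the action names below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID, N_OUT = 8, 16, 4   # token features, hidden units, actions (assumed sizes)

# Untrained random weights: input->hidden, hidden->hidden (recurrent), hidden->output.
W_in = rng.normal(0, 0.1, (N_HID, N_IN))
W_rec = rng.normal(0, 0.1, (N_HID, N_HID))
W_out = rng.normal(0, 0.1, (N_OUT, N_HID))

ACTIONS = ["shift", "attach-left", "attach-right", "reduce"]  # hypothetical inventory

def parse_trace(tokens):
    """Run one processing trace: one structure-building action per step."""
    h = np.zeros(N_HID)                   # hidden state carries parse context
    actions = []
    for x in tokens:                      # x: feature vector for one input token
        h = np.tanh(W_in @ x + W_rec @ h) # tanh units, per Kalman & Kwasny (1992)
        y = W_out @ h                     # scores over candidate actions
        actions.append(ACTIONS[int(np.argmax(y))])
    return actions

# Usage: a 5-token "sentence" of random feature vectors yields 5 actions.
sentence = [rng.normal(size=N_IN) for _ in range(5)]
print(parse_trace(sentence))
```

In the real system the weights would be trained on the processing traces, so that executing the emitted actions reconstructs the target parse; the sketch only shows the step-by-step control flow.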
Recently, our focus has been on:
Kalman, B.L., Kwasny, S.C., and Abella, A. Decomposing Input Patterns to Facilitate Training. Proceedings of the World Congress on Neural Networks, Volume III, (Portland, Oregon, July 11-15, 1993), 503-506.
Kalman, B.L., and Kwasny, S.C. Why Tanh: Choosing a Sigmoidal Function. Proceedings of the International Joint Conference on Neural Networks, Volume IV, (Baltimore, June 1992), 578-581.
Kwasny, S.C., and Faisal, K.A. Connectionism and Determinism in a Syntactic Parser. Connection Science 2, 1-2, (1990), 63-82. Reprinted as Chapter 7 in Sharkey, Noel E. (Ed.), Connectionist Natural Language Processing, Intellect Publishers, UK, 1992, 119-138.
Kwasny, S.C., and Faisal, K.A. Symbolic Parsing via Sub-Symbolic Rules. Chapter 9 in Dinsmore, John (Ed.), Closing the Gap: Symbolism vs. Connectionism, Lawrence Erlbaum Associates, Hillsdale, NJ, 1992, 209-235.
Kwasny, S.C., Johnson, S., and Kalman, B.L. Recurrent Natural Language Parsing. Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society, Lawrence Erlbaum Associates, Hillsdale, NJ, 1994, 525-530.
Kwasny, S.C. and Kalman, B.L. Tail-Recursive Distributed Representations and Simple Recurrent Neural Networks. Connection Science 7, 1 (March, 1995), 61-80.
Kwasny, S.C., Kalman, B.L., and Chang, N. Distributed Patterns as Hierarchical Structures. Proceedings of the World Congress on Neural Networks, Volume II, (Portland, Oregon, July 11-15, 1993), 198-201.
McCann, P.J., and Kalman, B.L. Parallel Training of Simple Recurrent Neural Networks. Proceedings of the World Congress on Computational Intelligence (WCCI'94), Orlando, Florida, (June 26-July 2, 1994), Volume I, 167-170.
McCann, P.J., and Kalman, B.L. Batch Parallel Training of Simple Recurrent Neural Networks. Proceedings of the World Congress on Neural Networks (WCNN), San Diego, California, (June 4-9, 1994), Volume III, 533-538.
While syntax is relatively well understood, challenges remain in producing processors that cover the range of sentences in a given language, are robust and efficient, and produce meaningful representations under all possible circumstances.
These areas are a focus of researchers building systems in which language structure is learned from examples, either with statistical techniques or by training a neural network. In our research and that of others, coverage has been shown to improve through learning. Robustness and efficiency can be addressed statistically and also, in our research, by neural networks. Finding suitable representations, and translating input sentences into those representations, remains a central problem: what works in a small domain often does not scale well. Here again, statistical and learning-based approaches seem to work well.
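The statistical route mentioned above can be illustrated with a deliberately tiny example: estimating word-transition probabilities from example sentences and then scoring new input, so that sentences resembling the training examples score higher than scrambled ones. The corpus, smoothing constant, and vocabulary size below are toy assumptions chosen only to make the sketch self-contained.

```python
# Toy sketch of learning language regularities from examples: a smoothed
# bigram model estimated from a three-sentence corpus. All values here
# are illustrative assumptions, not from the research described above.
from collections import Counter

corpus = [
    "the dog chased the cat",
    "the cat saw the dog",
    "a dog saw a cat",
]

bigrams = Counter()
unigrams = Counter()
for sent in corpus:
    words = ["<s>"] + sent.split()          # <s> marks sentence start
    for prev, cur in zip(words, words[1:]):
        bigrams[(prev, cur)] += 1
        unigrams[prev] += 1

def score(sentence, alpha=0.1, vocab=20):
    """Smoothed bigram probability; higher = more like the training examples."""
    words = ["<s>"] + sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab)
    return p

# A sentence built from seen transitions outscores a scrambled one.
print(score("the dog saw the cat") > score("cat the the saw dog"))  # → True
```

The smoothing (`alpha`) is what gives such models their robustness: unseen transitions get a small nonzero probability instead of rejecting the sentence outright, which mirrors the coverage and robustness gains from learning discussed above.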
Covington, Michael. Natural Language Processing for Prolog Programmers. Prentice-Hall, 1994.
Marcus, Mitchell. A Theory of Syntactic Recognition for Natural Language. MIT Press, 1980.
Miikkulainen, Risto. Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon, and Memory. MIT Press, 1993.
Weiss, S.M., and Kulikowski, C.A. Computer Systems that Learn. Morgan Kaufmann, 1991.