Washington University
Department of Computer Science
Campus Box 1045
St. Louis, MO 63130
314-935-6123
314-935-7302 (FAX)
This research combines natural language processing and recurrent neural networks to determine the technical challenges and benefits of systems designed in this way. Specifically, this research is:
We have developed a large covering grammar of English and are refining the design of a connectionist parser. In our model, parsing is controlled by a recurrent neural network, and attention-shifting rules are not required. The network is trained from processing traces taken from sample sentences, and we have developed a sizable collection (more than 200) of simple and complex sentences for training and testing purposes. The output of each processing step is a structure-building action that, when executed, symbolically constructs part of the output structure. The grammar has 75 rules, a small vocabulary, and a wide variety of sentence forms, including sentences with multiple secondary sentences. Sentences require between 15 and 90 processing steps to complete, with an average of about 35 steps.
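The control scheme described above can be sketched as a simple recurrent (Elman-style) network that reads one token per step and emits one structure-building action per step. Everything here is an illustrative assumption: the layer sizes, the random weights, and the action inventory are invented for the sketch, not taken from the actual system; only the overall shape (recurrent hidden state, tanh units, one action per processing step) reflects the description.

```python
# Minimal sketch of a simple recurrent network emitting one
# structure-building action per processing step. Dimensions, weights,
# and the action names below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID, N_OUT = 8, 16, 4   # token features, hidden units, actions (assumed sizes)

# Untrained random weights: input->hidden, hidden->hidden (recurrent), hidden->output.
W_in = rng.normal(0, 0.1, (N_HID, N_IN))
W_rec = rng.normal(0, 0.1, (N_HID, N_HID))
W_out = rng.normal(0, 0.1, (N_OUT, N_HID))

ACTIONS = ["shift", "attach-left", "attach-right", "reduce"]  # hypothetical inventory

def parse_trace(tokens):
    """Run one processing trace: one structure-building action per step."""
    h = np.zeros(N_HID)                   # hidden state carries parse context
    actions = []
    for x in tokens:                      # x: feature vector for one input token
        h = np.tanh(W_in @ x + W_rec @ h) # tanh units, per Kalman & Kwasny (1992)
        y = W_out @ h                     # scores over candidate actions
        actions.append(ACTIONS[int(np.argmax(y))])
    return actions

# Usage: a 5-token "sentence" of random feature vectors yields 5 actions.
sentence = [rng.normal(size=N_IN) for _ in range(5)]
print(parse_trace(sentence))
```

In the real system the weights would be trained on the processing traces, so that executing the emitted actions reconstructs the target parse; the sketch only shows the step-by-step control flow.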
Recently, our focus has been on:
Kalman, B.L., Kwasny, S.C., and Abella, A. Decomposing Input Patterns to Facilitate Training. Proceedings of the World Congress on Neural Networks, Volume III, (Portland, Oregon, July 11-15, 1993), 503-506.
Kalman, B.L., and Kwasny, S.C. Why Tanh: Choosing a Sigmoidal Function. Proceedings of the International Joint Conference on Neural Networks, Volume IV, (Baltimore, June 1992), 578-581.
Kwasny, S.C., and Faisal, K.A. Connectionism and Determinism in a Syntactic Parser. Connection Science 2, 1-2, (1990), 63-82. Reprinted as Chapter 7 in Sharkey, Noel E. (Ed.), Connectionist Natural Language Processing, Intellect Publishers, UK, 1992, 119-138.
Kwasny, S.C., and Faisal, K.A. Symbolic Parsing via Sub-Symbolic Rules. Chapter 9 in Dinsmore, John (Ed.), Closing the Gap: Symbolism vs. Connectionism, Lawrence Erlbaum Associates, Hillsdale, NJ, 1992, 209-235.
Kwasny, S.C., Johnson, S., and Kalman, B.L. Recurrent Natural Language Parsing. Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society, Lawrence Erlbaum Associates, Hillsdale, NJ, 1994, 525-530.
Kwasny, S.C. and Kalman, B.L. Tail-Recursive Distributed Representations and Simple Recurrent Neural Networks. Connection Science 7, 1 (March, 1995), 61-80.
Kwasny, S.C., Kalman, B.L., and Chang, N. Distributed Patterns as Hierarchical Structures. Proceedings of the World Congress on Neural Networks, Volume II, (Portland, Oregon, July 11-15, 1993), 198-201.
McCann, P.J., and Kalman, B.L. Parallel Training of Simple Recurrent Neural Networks. Proceedings of the World Congress on Computational Intelligence (WCCI'94), Orlando, Florida, (June 26-July 2, 1994), Volume I, 167-170.
McCann, P.J., and Kalman, B.L. Batch Parallel Training of Simple Recurrent Neural Networks. Proceedings of the World Congress on Neural Networks (WCNN), San Diego, California, (June 4-9, 1994), Volume III, 533-538.
While syntax is relatively well understood, challenges remain in producing processors that cover the range of sentences in a given language, are robust and efficient, and produce meaningful representations under all possible circumstances.
These areas are a focus of researchers building systems in which language structure is learned from examples, either with statistical techniques or by training a neural network. In our research and that of others, coverage has been shown to improve through learning. Robustness and efficiency can be addressed statistically and also, in our research, by neural networks. Finding suitable representations, and translating input sentences into those representations, remains a central problem: what works in a small domain often does not scale well. Here again, statistical and learning-based approaches seem to work well.
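The statistical route mentioned above can be illustrated with a deliberately tiny example: estimating word-transition probabilities from example sentences and then scoring new input, so that sentences resembling the training examples score higher than scrambled ones. The corpus, smoothing constant, and vocabulary size below are toy assumptions chosen only to make the sketch self-contained.

```python
# Toy sketch of learning language regularities from examples: a smoothed
# bigram model estimated from a three-sentence corpus. All values here
# are illustrative assumptions, not from the research described above.
from collections import Counter

corpus = [
    "the dog chased the cat",
    "the cat saw the dog",
    "a dog saw a cat",
]

bigrams = Counter()
unigrams = Counter()
for sent in corpus:
    words = ["<s>"] + sent.split()          # <s> marks sentence start
    for prev, cur in zip(words, words[1:]):
        bigrams[(prev, cur)] += 1
        unigrams[prev] += 1

def score(sentence, alpha=0.1, vocab=20):
    """Smoothed bigram probability; higher = more like the training examples."""
    words = ["<s>"] + sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab)
    return p

# A sentence built from seen transitions outscores a scrambled one.
print(score("the dog saw the cat") > score("cat the the saw dog"))  # → True
```

The smoothing (`alpha`) is what gives such models their robustness: unseen transitions get a small nonzero probability instead of rejecting the sentence outright, which mirrors the coverage and robustness gains from learning discussed above.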
Covington, Michael. Natural Language Processing for Prolog Programmers. Prentice-Hall, 1994.
Marcus, Mitchell. A Theory of Syntactic Recognition for Natural Language. MIT Press, 1980.
Miikkulainen, Risto. Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon, and Memory. MIT Press, 1993.
Weiss, S.M., and Kulikowski, C.A. Computer Systems that Learn. Morgan Kaufmann, 1991.