Search Engines for Human Language Technology Applications

December 7, 2005
2:50 pm - 4:00 pm
Halligan 111
Speaker: Jamie Callan, CMU


In the last decade, text search has become common. Everyone uses it. Although people complain about search quality, it's good enough for average people to routinely search billions of documents in a few seconds, and to usually find what they need. Given this success, is text search still an interesting research topic?

Several recent trends suggest that text retrieval research is in the midst of an important transition in its approaches to representation and statistical inference. Information Retrieval is also broadening its view of "the user" to include question answering systems, reading tutors, and other human language technology (HLT) applications capable of describing their requirements very precisely. These trends suggest that our view of text retrieval may be very different in the coming years.


Jamie Callan is an Associate Professor at the Language Technologies Institute, a graduate Computer Science department at Carnegie Mellon University. Jamie's research group studies full-text information retrieval, including retrieval of structured (XML) documents, full-text search in peer-to-peer networks, text retrieval for reading and ESL/ELL tutors, and text mining in large public comment repositories. His group and Bruce Croft's group at UMass Amherst develop and distribute Lemur, an open-source system for text retrieval research. Jamie's earlier IR research studied architectures for large-scale information retrieval and adaptive filtering systems, and first-generation Web search systems.