Knowledge Discovery in Networks

October 26, 2005

2:50 pm - 4:00 pm

Halligan 111

Abstract

Networks are an increasingly common method of representing the relationships among sets of interacting entities. This basic data structure is reflected in how we analyze and understand social networks, networks of scholarly citations and web pages, and networks of computers and communications devices. Over the past five years, my students and I have developed a number of methods for learning statistical models of networks that can make accurate predictions about the attributes of nodes in the network and also provide insight into the broad structure of statistical dependencies among different types of nodes. These models build on methods developed previously in statistics, machine learning, and knowledge discovery, including Bayesian networks and probability estimation trees. We have applied these techniques to a wide variety of problems, including citation analysis and fraud detection. We have also developed an open-source software environment incorporating our tools for statistical modeling and ad hoc querying of relational data.

David Jensen is Associate Professor of Computer Science and Director of the Knowledge Discovery Laboratory at the University of Massachusetts Amherst. From 1991 to 1995, he served as an analyst with the Office of Technology Assessment, an agency of the United States Congress. He received his doctorate from Washington University in 1992. His current research focuses on machine learning and knowledge discovery in relational data, with applications to intelligence analysis, social network analysis, and fraud detection. He serves on the editorial boards of Machine Learning and the Journal of Artificial Intelligence Research, and on the program committees of the International Conference on Machine Learning and the International Conference on Knowledge Discovery and Data Mining.