Research Talk: Anomaly Detection with Relevance Feedback

April 14, 2010

1:30pm-2:30pm

Halligan 127

Speaker: Saeed Majidi

Host:

Abstract

Anomaly (or outlier) detection seeks to find patterns in a data set that do not conform to expected behavior. Most supervised and unsupervised anomaly detection methods rank points by an anomaly score based on a distance measure defined over the features describing each instance in the data set. A domain expert can then examine this ranked list of anomalies. A potential flaw in these approaches is that they weight all features equally when computing the score for each data point: a data point may be anomalous in only some of the features, yet the normal features will dominate the distance calculation. In this talk we present a method that takes relevance feedback from the expert in order to determine a set of weights on the features for the distance metric. Specifically, at each iteration, we use an active learning approach to choose and present K points to the expert, who then indicates whether or not they are “interesting” anomalies. From this feedback our method recalculates the feature weights, with the goal of ensuring that more anomalous points appear higher in the ranking on the next iteration. Our research is motivated by the problem of finding anomalous objects in astronomical photometric time series, and we show preliminary results of our method on this data set.
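
To make the idea of feature-weighted ranking with expert feedback concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the speaker's method: the anomaly score (weighted distance to the nearest other point), the weight update rule, the toy data, and the simulated expert are all hypothetical, and simply showing the top K points stands in for the active learning step that the abstract mentions but does not specify.

```python
import math
from statistics import median


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def rank_by_anomaly_score(points, weights):
    """Return point indices sorted by anomaly score, most anomalous first.

    Assumed score: weighted distance to the nearest other point.
    """
    def score(i):
        return min(weighted_distance(points[i], q, weights)
                   for j, q in enumerate(points) if j != i)
    return sorted(range(len(points)), key=score, reverse=True)


def update_weights(points, weights, interesting, uninteresting, eps=1e-6):
    """Hypothetical update rule, for illustration only.

    A feature gains weight when points the expert marked interesting deviate
    from the feature median more than points marked uninteresting do.
    """
    if not interesting or not uninteresting:
        return list(weights)  # not enough feedback to compare the two groups
    updated = []
    for f, w in enumerate(weights):
        center = median(p[f] for p in points)

        def mean_sq_dev(indices):
            return sum((points[i][f] - center) ** 2 for i in indices) / len(indices)

        ratio = (mean_sq_dev(interesting) + eps) / (mean_sq_dev(uninteresting) + eps)
        updated.append(w * ratio)
    total = sum(updated)
    return [w / total for w in updated]  # normalize so the weights sum to 1


# Toy data: feature 0 is informative; features 1 and 2 are noisy.
points = [
    (0.0, 0.0, 0.0),
    (0.1, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (0.1, 1.0, 1.0),
    (0.0, 4.0, 4.0),  # index 4: extreme only in the noisy features
    (2.0, 0.5, 0.5),  # index 5: anomalous only in feature 0
]
weights = [1 / 3, 1 / 3, 1 / 3]  # start with equal weights

first = rank_by_anomaly_score(points, weights)
shown = first[:2]  # K = 2 points presented to the expert

# Simulated expert: only point 5 counts as an "interesting" anomaly.
interesting = [i for i in shown if i == 5]
uninteresting = [i for i in shown if i != 5]

weights = update_weights(points, weights, interesting, uninteresting)
second = rank_by_anomaly_score(points, weights)

print("ranking with equal weights:", first)
print("weights after feedback:", [round(w, 4) for w in weights])
print("ranking after feedback:", second)
```

With equal weights, the point that is extreme only in the noisy features ranks first and the point that is anomalous only in feature 0 ranks second. After one round of feedback, almost all of the weight moves to feature 0 and the order of those two points is reversed.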