Tufts ML Alumni
Current location: Tufts University
Associated Publications:
Authors: U. Rebbapragada, R. Lomasky, C. E. Brodley and M. Friedl
Proceedings of the 2008 IEEE International Geoscience and Remote Sensing Symposium
Authors: R. Lomasky, C. E. Brodley, M. Aernecke, D. Walt, and M. Friedl
Abstract: This paper presents Active Class Selection (ACS), a new class of problems for multi-class supervised learning. If one can control the classes from which training data is generated, utilizing feedback during learning to guide the generation of new training data will yield better performance than learning from any a priori fixed class distribution. ACS is the process of iteratively selecting class proportions for data generation. In this paper we present several methods for ACS. In an empirical evaluation, we show that for a fixed number of training instances, methods based on increasing class stability outperform methods that seek to maximize class accuracy or that use random sampling. Finally we present results of a deployed system for our motivating application: training an artificial nose to discriminate vapors.
Authors: R. Lomasky, C. E. Brodley, S. Bencic, M. Aernecke, and D. Walt
NIPS Workshop: Testing of Deployable Learning and Decision Systems
Abstract: For some supervised learning tasks, researchers can control the data generation process. In such cases, it would be beneficial to have feedback during learning to guide future data collection. Our research is motivated by a real-world problem: discrimination of vapors with an “artificial nose”. The nose’s accuracy is vital, because it will be deployed to detect harmful gases in critical situations, such as an airport or a subway. We address how to improve accuracy if insufficient examples have been observed to accurately define the class’s decision boundaries. This problem differs from situations in which active learning is applicable. Active learning either requests labels for existing data or explicitly queries the feature space. In contrast, our task allows us to ask for additional examples from specific classes. In this paper we propose an adaptive heuristic to identify from which classes instances should be added during the learning process. We evaluate our methods on the artificial nose data and show significant improvement over random sampling.
Current Research Topics:
Past Research Topics:
Description: We study problems related to the generation of training data. We are interested in two scenarios. 1) A new class of problems we have defined, Active Class Selection (ACS). ACS addresses the question: if one can collect n additional instances, how should they be distributed with respect to class? 2) Active Learning, in which one requests labels for existing unlabeled data.
Specifically, Active Class Selection addresses the tasks for which one can control the classes from which training data are generated. In such cases, utilizing feedback during learning to guide the generation of new training data will yield better performance than learning from an a priori fixed class distribution. Our methods work within a multi-armed bandit framework.
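The core of an ACS iteration is deciding how many of the n new instances to request from each class. A minimal sketch of one such allocation step, assuming a per-class "instability" score has already been computed from feedback between learning rounds (the function and score names here are hypothetical, not the implementation from the papers above):

```python
def allocate_by_instability(instability, n_new):
    """Distribute n_new requested instances across classes in
    proportion to each class's instability score, where a higher
    score means the class's predictions changed more between
    learning rounds and thus warrants more new data."""
    total = sum(instability.values())
    if total == 0:
        # no signal yet: fall back to a uniform allocation
        k = len(instability)
        alloc = {c: n_new // k for c in instability}
    else:
        alloc = {c: int(n_new * s / total) for c, s in instability.items()}
    # hand out any rounding remainder to the most unstable classes first
    remainder = n_new - sum(alloc.values())
    for c in sorted(instability, key=instability.get, reverse=True)[:remainder]:
        alloc[c] += 1
    return alloc

# e.g. three vapor classes with instability scores from the last round
print(allocate_by_instability({"acetone": 0.6, "ethanol": 0.3, "toluene": 0.1}, 10))
```

In a bandit-style formulation, each class plays the role of an arm, and the instability score acts as the reward signal that steers where the next pulls (data requests) go.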
In regard to active learning, we are investigating several real-world issues. Specifically, how to perform active learning in the context of severe class imbalance, how to adapt to changes in the underlying concept to be learned (concept drift), and how to inject domain knowledge into the active learning framework.