Research Talk: Probabilistic Generative Models with Application to Astronomy

November 19, 2009
Halligan 111B
Speaker: Yuyang Wang, Tufts University


Machine learning techniques have been applied to many fields of scientific inquiry. In some areas, classification error can be tolerated, while in other domains misclassification costs are high. In such cases it makes sense to abstain and not predict the class for some instances. For example, published star catalogs in astronomy are required to have high fidelity, and it is preferable to leave some stars to be classified by domain experts than to publish a wrong categorization. Another issue that arises in astronomy is class discovery. Typical classification approaches assume that each instance in the test set belongs to one of the predefined classes; however, test data might contain examples that originate from classes that have not been defined for the given domain.

To address both problems, we aim to build a probabilistic generative model for astronomy data that explicitly models data from both predefined and unknown classes and estimates their probability densities using a mixture model. The model targets two tasks involving known and unknown data: 1) classification with abstention and 2) automatic discovery of new classes. Our ultimate goal is to improve the automated system for classifying astronomy data, i.e., to boost period-detection performance and classification accuracy.
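
To make the two tasks concrete, here is a minimal Python sketch of how a generative model can support both abstention and new-class flagging. It is not the speaker's model: it assumes one Gaussian per known class, equal class priors, and a uniform background density standing in for the unknown classes. The names `classify`, `ABSTAIN`, `UNKNOWN`, `unknown_prior`, and `min_posterior`, as well as the synthetic two-dimensional data, are all hypothetical choices made for illustration.

```python
import numpy as np

ABSTAIN, UNKNOWN = -1, -2  # hypothetical sentinel labels, not from the talk


def fit_gaussian(X):
    """Estimate mean and covariance from the rows of X (small ridge for stability)."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mean, cov


def log_gaussian(X, mean, cov):
    """Log density of a multivariate Gaussian evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def classify(X, class_params, background_log_density,
             unknown_prior=0.05, min_posterior=0.9):
    """Return (labels, posteriors) over the known classes plus one unknown component.

    Assumptions: equal priors over known classes, a constant (uniform)
    background density for the unknown component.
    """
    k = len(class_params)
    known_prior = (1.0 - unknown_prior) / k
    log_joint = np.empty((X.shape[0], k + 1))
    for j, (mean, cov) in enumerate(class_params):
        log_joint[:, j] = np.log(known_prior) + log_gaussian(X, mean, cov)
    log_joint[:, k] = np.log(unknown_prior) + background_log_density

    # Normalize in log space to avoid underflow far from every class.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    post /= post.sum(axis=1, keepdims=True)

    best = post.argmax(axis=1)
    labels = np.where(best == k, UNKNOWN, best)
    # Abstain whenever no component is confident enough.
    labels[post.max(axis=1) < min_posterior] = ABSTAIN
    return labels, post


# Synthetic data: class 1 is the mirror image of class 0 about x = 2.5.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X0 = base
X1 = base * np.array([-1.0, 1.0]) + np.array([5.0, 0.0])
class_params = [fit_gaussian(X0), fit_gaussian(X1)]

# Uniform background over a 100 x 100 box (an assumed feature range).
background = -np.log(100.0 * 100.0)

# Points near each class, midway between them, and far from both.
X_test = np.array([[0.0, 0.0], [5.0, 0.0], [2.5, 0.0], [0.0, 40.0]])
labels, posteriors = classify(X_test, class_params, background)
print(labels)
```

In this sketch the point midway between the two classes is left for an expert because neither class reaches the posterior threshold, while the point far from both is attributed to the background component and can be treated as a candidate for a new class. Raising `min_posterior` trades coverage for fidelity, which is the trade-off the catalog example describes.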