Graduate Research Talk: Recommendation Systems for Hydrologic Data

October 30, 2018
12:00 PM
Halligan 209
Speaker: Zhaokun Xue
Host: Alva Couch

Abstract

Recommendation systems have become a powerful feature of many online systems, including electronic commerce, video and music applications, and news websites. Most current recommendation systems are designed to increase user satisfaction during the use of media services. Collaborative filtering systems deal with large crowds of people and sparse ratings of objects, while content-based recommendation systems use text categorization and classification, or handle objects with relatively large attribute spaces. Recommendation systems for scientific data, by contrast, are targeted at data reuse and scientific progress. A recommendation system for scientific data must handle objects with very large and complex sets of attributes, which makes content-based filtering appropriate. The goal of this work is to establish a content-based recommendation system for hydrologic data on HydroShare, a platform for sharing water data. The process for making scientific data recommendations includes inferring users’ interests from the datasets with which they interact, and then calculating the similarity between users’ interests and datasets’ subject lists. Evaluating such a system requires comparing the datasets that are recommended with the datasets in which users have already shown interest. Evaluation of the recommendation system, based upon a year of user activity records, indicates that accounting for 6 days of user activity leads to the most accurate recommendations.
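
The two steps described in the abstract (inferring interests from recent activity, then scoring datasets by similarity to those interests) can be illustrated with a minimal Python sketch. This is not the speaker's implementation. The keyword-count profile, the use of cosine similarity, the function names, and the sample dataset identifiers and subjects are all assumptions made for illustration; only the 6-day activity window comes from the abstract.

```python
from collections import Counter
from datetime import date, timedelta
from math import sqrt


def infer_interests(activity, subjects_by_dataset, today, window_days=6):
    """Count subject keywords of datasets the user touched within the window.

    Assumption: interests are modeled as simple keyword counts.
    """
    cutoff = today - timedelta(days=window_days)
    profile = Counter()
    for day, dataset_id in activity:
        if cutoff < day <= today:
            profile.update(subjects_by_dataset.get(dataset_id, []))
    return profile


def cosine(profile, subjects):
    """Cosine similarity between a keyword-count profile and a subject list.

    Assumption: the abstract says only "similarity"; cosine is one common choice.
    """
    subject_set = set(subjects)
    if not profile or not subject_set:
        return 0.0
    dot = sum(count for kw, count in profile.items() if kw in subject_set)
    norm_profile = sqrt(sum(c * c for c in profile.values()))
    norm_subjects = sqrt(len(subject_set))
    return dot / (norm_profile * norm_subjects)


def recommend(profile, subjects_by_dataset, seen, top_n=3):
    """Rank unseen datasets by similarity, dropping those with no overlap."""
    scored = [
        (cosine(profile, subjects), dataset_id)
        for dataset_id, subjects in subjects_by_dataset.items()
        if dataset_id not in seen
    ]
    scored = [(score, ds) for score, ds in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [ds for _, ds in scored[:top_n]]


# Hypothetical sample data, invented for this sketch.
subjects_by_dataset = {
    "ds_streamflow": ["streamflow", "discharge", "utah"],
    "ds_snow": ["snow", "swe", "utah"],
    "ds_flood": ["streamflow", "flood", "discharge"],
    "ds_soil": ["soil moisture", "texas"],
}
activity = [
    (date(2018, 10, 28), "ds_streamflow"),  # inside the 6-day window
    (date(2018, 10, 1), "ds_soil"),         # too old, ignored for interests
]
today = date(2018, 10, 30)

profile = infer_interests(activity, subjects_by_dataset, today)
seen = {dataset_id for _, dataset_id in activity}
ranked = recommend(profile, subjects_by_dataset, seen)
print(ranked)
```

In this sketch, the evaluation idea from the abstract would correspond to rerunning `infer_interests` with different `window_days` values and checking how often the ranked datasets match those the user later interacted with.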