Graduate Research Talk: A Recommender Systems Approach to Imputations of Drug Profiles in Connectivity Mapping

October 31, 2018
11:00 AM
Halligan 209
Speaker: Rebecca Newman
Host: Donna Slonim

Abstract

Finding new uses for known drugs, referred to as "drug repurposing," reduces the duration and cost of the drug discovery process. However, the search space for hypothesis generation is vast and has not been investigated experimentally. We seek to improve this process by imputing experimental data, specifically within the context of the Connectivity Map (CMap) database. CMap stores information as signatures: vectors of real-valued gene expression levels that show which genes a cell is using to make proteins when it is exposed to a drug. Though CMap provides functionality to generate hypotheses about relationships between drugs and cells, clinically relevant matches may be overlooked because signature information is missing. Recommender systems are an appealing way to impute unknown signatures accurately, as they have been shown to handle large, sparse rating matrices. We modified and evaluated recommenders with nearest neighbor and SVD approaches for accuracy in predicting both raw signatures and CMap query response. We compared these approaches to baseline averaging and to other imputation algorithms such as Group Factor Analysis (GFA).
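
For readers unfamiliar with recommender-style imputation, the following is a minimal illustrative sketch of two ideas named in the abstract: baseline averaging and an SVD-based fill-in. It is not the speaker's implementation. The matrix layout (rows as drugs, columns as expression features, NaN for unmeasured values), the function names `mean_impute` and `svd_impute`, and the parameters `rank`, `n_iters`, and `tol` are all assumptions made for illustration, and the SVD routine shown is a generic iterative low-rank fill, not necessarily the variant evaluated in the talk.

```python
import numpy as np

def mean_impute(M):
    """Baseline averaging: fill each missing entry with its column mean.

    Assumes every column has at least one observed value.
    """
    filled = M.copy()
    col_means = np.nanmean(M, axis=0)
    rows, cols = np.where(np.isnan(M))
    filled[rows, cols] = col_means[cols]
    return filled

def svd_impute(M, rank=2, n_iters=100, tol=1e-6):
    """Iterative low-rank SVD imputation (hypothetical helper).

    Starts from the mean-imputed matrix, then repeatedly replaces the
    missing entries with the values of the best rank-`rank` approximation.
    Observed entries are never modified.
    """
    missing = np.isnan(M)
    filled = mean_impute(M)
    if not missing.any():
        return filled
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        change = np.max(np.abs(low_rank[missing] - filled[missing]))
        filled[missing] = low_rank[missing]
        if change < tol:
            break
    return filled

# Toy example: a rank-1 "drug x feature" matrix with one unmeasured value.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
observed = truth.copy()
observed[1, 1] = np.nan

baseline = mean_impute(observed)
low_rank_fill = svd_impute(observed, rank=1)
print("true value:      ", truth[1, 1])
print("column-mean fill:", baseline[1, 1])
print("rank-1 SVD fill: ", low_rank_fill[1, 1])
```

In this toy case the data are exactly low rank, so the SVD fill has an advantage over the column mean by construction; real expression data are noisy, and a suitable rank would have to be chosen empirically.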