Discovering Disease Subtypes that Improve Treatment Predictions: Interpretable Machine Learning for Personalized Medicine
Abstract
For complex diseases like depression, choosing a successful treatment
from several possible drugs remains a trial-and-error process in
current clinical practice. By applying statistical machine learning to
the electronic health records of thousands of patients, can we
discover disease subtypes that improve both population-wide
understanding and patient-specific drug recommendations? One
popular approach is to represent noisy, high-dimensional health
records as mixtures of low-dimensional subtypes via a probabilistic
topic model. I will introduce this common dimensionality reduction
method and explain why off-the-shelf topic models are misspecified for
downstream prediction tasks in domains ranging from text analysis to
healthcare. To overcome the resulting poor predictions, I will present a new
framework -- prediction-constrained training -- which learns
interpretable topic models that offer competitive drug
recommendations. I will also discuss open challenges in using machine
learning to improve clinical decision-making.
Bio
Michael C. Hughes ("Mike") is currently a postdoctoral fellow in computer science at Harvard University, where he develops new machine learning methods for healthcare applications with Prof. Finale Doshi-Velez. His current research focus is helping clinicians understand and treat complex diseases like depression by training statistical models from big, messy electronic health record datasets. Other research interests include Bayesian data analysis, optimization algorithms, and any machine learning applications that advance medicine and the sciences. He completed a Ph.D. in the Department of Computer Science at Brown University in May 2016, advised by Prof. Erik Sudderth. You can find his papers and code on the web at www.michaelchughes.com.