Quals research talk: Human Centered Model Selection Through Visual Analytics

October 27, 2017
Halligan 209
Speaker: Dylan Cashman, Tufts University
Host: Remco Chang


Model selection is a classic machine learning task: choosing an algorithm and its parameters to optimize some objective function. Objective functions generally decompose into two parts: a measure of loss on the training set that drives the model to match the training distribution, and a regularizing penalty that drives the model to generalize well to held-out test data. However, such objective functions don't take into account how the models are used. A model used by an automated machine and a model used to inform an intervention in a patient's health care plan have wildly different constraints and costs. Ignoring these differences results in a bias towards black box models that increasingly do not fit the wide range of usage scenarios of data science in the big data world.

We propose a formulation of model selection that takes into account the human using the chosen model by adding a human penalty to the objective function. We begin by offering examples of data analysis tasks that don't fit into the classic model selection formulation. We then highlight three recent and ongoing projects in which visual analytics is used to drive model selection that is cognizant of the human penalty. These projects cover classical machine learning tasks such as classification and regression, as well as deep learning.