Bayesian Nonparametric Approaches to Reinforcement Learning in Partially Observable Domains
Making sequential decisions without complete information is important for many applications, and the field of partially observable reinforcement learning (PORL) provides a formal framework for designing agents that improve their decision-making with experience. Unfortunately, approaches to PORL have had limited success in real-world applications. We suggest that Bayesian nonparametric methods, by providing flexible ways to build models and incorporate expert information, can address many of the issues with current methods.
In this talk, I will present two Bayesian nonparametric approaches to PORL. The first is a model-based approach that posits that the world may have an infinite number of underlying states, but the agent is likely to spend most of its time in a small (finite) subset of these states---and thus, those are the only states that need to be well-modelled. I will then show that the same machinery can be applied if we believe that an optimal state controller for an environment might require an infinite number of nodes, but only a small (finite) subset will be needed for most operations. I will derive the models and demonstrate how they address key issues with previous PORL approaches.
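As an aside for readers unfamiliar with the underlying machinery: the intuition that an infinite-state model concentrates most of its probability mass on a small finite subset of states is exactly what the Dirichlet process's stick-breaking construction captures. The sketch below is purely illustrative (not code from the talk); the parameter names and the 99% coverage threshold are arbitrary choices for demonstration.

```python
# Illustrative sketch: a Dirichlet-process stick-breaking construction.
# An "infinite" number of state weights exists in principle, but the
# sampled weights decay geometrically, so a handful of states carry
# nearly all the probability mass -- the intuition behind modelling
# only the states the agent actually visits.
import random

def stick_breaking(alpha, n_sticks, rng):
    """Sample the first n_sticks weights of a GEM(alpha) distribution."""
    weights = []
    remaining = 1.0
    for _ in range(n_sticks):
        # Draw Beta(1, alpha) via the standard two-gamma identity.
        a = rng.gammavariate(1.0, 1.0)
        b = rng.gammavariate(alpha, 1.0)
        frac = a / (a + b)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    return weights

rng = random.Random(0)
w = stick_breaking(alpha=1.0, n_sticks=100, rng=rng)

# Count how many of the 100 truncated states are needed to cover
# 99% of the sampled probability mass.
total, k = 0.0, 0
for wi in sorted(w, reverse=True):
    total += wi
    k += 1
    if total >= 0.99:
        break
print(k)  # typically a small number, far below the 100-state truncation
```

Smaller concentration parameters `alpha` push even more mass onto the first few states; larger values spread it out, which is why `alpha` is often given its own prior in these models.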
Finale Doshi-Velez is a PhD candidate at MIT's Computer Science and Artificial Intelligence Laboratory. Prior to that, she completed her MSc at the University of Cambridge under a Marshall Fellowship. Her research focuses on the intersection of statistical machine learning and decision-making under uncertainty.