# The Complexity of Model Minimization

## Abstract

There are many phenomena in engineering, manufacturing, science, and social science that can be modeled as controlled stochastic systems with rewards or costs. There are well-understood techniques, based on linear programming, for finding optimal control policies for such models. Unfortunately, these techniques are not practical when the state space is huge. For instance, if the model is given in terms of features (or state variables), so that the number of states is exponential in the size of the representation, the problem becomes intractable.

One approach to reducing complexity is to group together states that "act (approximately) the same," namely, states that have (approximately) the same rewards or costs and (approximately) the same transition probabilities. There are algorithms available for working with the resulting "bounded-parameter MDP" (BMDP). However, I will show that the problem of identifying states that act approximately the same is NP^{PP}-complete for factored or succinctly represented MDPs. (I will also talk about the class NP^{PP}, which is probably not a familiar one to most people.) Thus, without additional, highly restrictive constraints on the problem, this approach seems to be an impractical work-around for the problem of large state spaces.

I will present at least one set of restrictions on the models that makes the construction of BMDPs polynomial-time, and I will also mention some work in progress that may lead to heuristics for constructing probably-correct BMDPs.
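The linear-programming technique alluded to above can be made concrete. For a discounted MDP with state set $S$, action set $A$, rewards $R(s,a)$, transition probabilities $P(s' \mid s,a)$, and discount factor $\gamma \in [0,1)$ (notation introduced here for illustration, not taken from the talk), one standard formulation is:

```latex
\begin{aligned}
\text{minimize} \quad & \sum_{s \in S} v(s) \\
\text{subject to} \quad & v(s) \;\ge\; R(s,a) + \gamma \sum_{s' \in S} P(s' \mid s,a)\, v(s')
  \qquad \text{for all } s \in S,\ a \in A.
\end{aligned}
```

The optimal value function is the unique optimal solution, and a policy that chooses, in each state, an action whose constraint is tight is optimal. The LP has on the order of $|S|$ variables and $|S|\,|A|$ constraints, which is exactly what becomes unmanageable when $|S|$ is exponential in the number of state variables.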
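To illustrate the grouping idea in its simplest, exact form (the approximate version is what the hardness result concerns), here is a minimal sketch of state aggregation by partition refinement on an explicitly represented MDP. All names and the data layout are illustrative assumptions, not part of the talk; for factored MDPs the state space would be too large to enumerate like this, which is the point of the complexity result.

```python
from collections import defaultdict

def aggregate_states(states, actions, reward, trans):
    """Group states of an explicit MDP that act exactly the same.

    states:  list of hashable state ids
    actions: list of action ids
    reward:  dict mapping (s, a) -> reward (exact equality is used here;
             an approximate version would compare within a tolerance)
    trans:   dict mapping (s, a) -> {next_state: probability}

    Returns a partition (list of frozensets) in which states in the same
    block have identical rewards and identical total transition
    probability into every block, for every action.
    """
    # Initial split: states with the same reward vector start together.
    by_reward = defaultdict(set)
    for s in states:
        by_reward[tuple(reward[(s, a)] for a in actions)].add(s)
    partition = [frozenset(b) for b in by_reward.values()]

    # Refine until stable: split any block whose members disagree on the
    # probability mass sent into some block under some action.
    changed = True
    while changed:
        changed = False
        new_partition = []
        for block in partition:
            signature = defaultdict(set)
            for s in block:
                key = tuple(
                    tuple(sum(trans[(s, a)].get(t, 0.0) for t in b)
                          for b in partition)
                    for a in actions
                )
                signature[key].add(s)
            if len(signature) > 1:
                changed = True
            new_partition.extend(frozenset(v) for v in signature.values())
        partition = new_partition
    return partition
```

For example, two states with the same reward that both move deterministically to the same third state end up in one block, while the third state (with a different reward) forms its own block. The loop terminates because each pass can only refine the partition, never coarsen it.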