To operate effectively in complex environments learning agents require the ability to form useful abstractions, that is, the ability to selectively ignore irrelevant details. Stated in general terms this is a very difficult problem. Much of the work in this field is specialized to specific modeling paradigms or classes of problems. In this thesis we introduce an abstraction framework for Markov decision processes (MDPs) based on homomorphisms relating MDPs. We build on classical finite-state automata literature and develop a minimization framework for MDPs that can exploit structure and symmetries to derive smaller equivalent models of the problem. Since employing homomorphisms for minimization requires that the resulting abstractions be exact, we introduce approximate and partial homomorphisms and develop bounds for the loss that results from employing relaxed abstraction criteria.
Our MDP minimization results can be readily employed by reinforcement learning (RL) methods for forming abstractions. We extend our abstraction approach to hierarchical RL, specifically using the options framework. We introduce relativized options, a generalization of Markov sub-goal options, that allow us to define options without an absolute frame of reference. We introduce an extension to the options framework, based on relativized options, that allows us to learn simultaneously at multiple levels of the hierarchy and also employ hierarchy-specific abstractions. We provide certain theoretical guarantees regarding the performance of hierarchical systems that employ approximate abstraction. We empirically demonstrate the utility of relativized options in several test-beds.
Relativized options can also be interpreted as behavioral schemas. We demonstrate that such schemas can be profitably employed in a hierarchical RL setting. We also develop algorithms that learn the appropriate parameter binding to a given schema. We empirically demonstrate the validity and utility of these algorithms. Relativized options allow us to model certain aspects of deictic or indexical representations. We develop a modification of our parameter binding algorithm suited to hierarchical RL architectures that employ deictic representations.