Learning Parameterized Policies for Planning Annotated RL
Abstract
Several recent approaches use AI planning within hierarchical reinforcement learning. These methods use planning operator descriptions to define options for learning primitive, or low-level, skills. Decomposing a task hierarchically through operators yields notable benefits both during training, where sample efficiency improves, and during evaluation, where generalization across different yet related tasks improves. In this study, we introduce a novel approach for defining parameterized options from operator descriptions. Our empirical evaluation on the mini-grid domain shows that the proposed approach not only improves sample efficiency but also overcomes some of the generalization limitations of these earlier methods.