Learning Routines for Effective Off-Policy Reinforcement Learning
Edoardo Cetin, Oya Celiktutan

TL;DR
This paper introduces a novel framework for reinforcement learning that learns higher-level routines, enabling more efficient off-policy learning with fewer environment interactions and improved performance.
Contribution
The paper presents a new routine-based action space that is learned end-to-end, reducing the need for manual action design in off-policy reinforcement learning.
Findings
Performance improvements over state-of-the-art algorithms
Fewer environment interactions needed per episode
Enhanced computational efficiency
Abstract
The performance of reinforcement learning depends upon designing an appropriate action space, where the effect of each action is measurable yet granular enough to permit flexible behavior. So far, this process has involved non-trivial user choices in terms of the available actions and their execution frequency. We propose a novel framework for reinforcement learning that effectively lifts such constraints. Within our framework, agents learn effective behavior over a routine space: a new, higher-level action space, where each routine represents a set of 'equivalent' sequences of granular actions with arbitrary length. Our routine space is learned end-to-end to facilitate the accomplishment of underlying off-policy reinforcement learning objectives. We apply our framework to two state-of-the-art off-policy algorithms and show that the resulting agents obtain relevant performance…
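To make the idea of acting over a routine space more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical toy environment (`LineWorld`) and a hand-written `decode_routine` function standing in for the learned mapping from a routine to a variable-length sequence of granular actions; in the paper this mapping is learned end-to-end. All names here are illustrative assumptions. The sketch only shows how one higher-level decision can cover several primitive environment steps.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Routine = Tuple[int, int]  # hypothetical encoding: (direction, length)


@dataclass
class LineWorld:
    """Hypothetical 1-D toy environment: walk from `pos` to `goal`."""
    goal: int = 6
    pos: int = 0

    def step(self, action: int) -> Tuple[int, float, bool]:
        # `action` is -1 or +1; each primitive step costs 1 until the goal is hit.
        self.pos += action
        done = self.pos == self.goal
        return self.pos, (0.0 if done else -1.0), done


def decode_routine(routine: Routine) -> List[int]:
    """Hand-written stand-in for a learned decoder (an assumption of this sketch)."""
    direction, length = routine
    return [direction] * length


def execute_routine(env: LineWorld, routine: Routine, gamma: float = 0.99):
    """Run the decoded actions open-loop; return obs, discounted reward, done, steps."""
    total, discount, steps = 0.0, 1.0, 0
    obs, done = env.pos, False
    for action in decode_routine(routine):
        obs, reward, done = env.step(action)
        total += discount * reward
        discount *= gamma
        steps += 1
        if done:  # stop early if the episode ends mid-routine
            break
    return obs, total, done, steps


def run_episode(policy: Callable[[int], Routine], env: LineWorld) -> Tuple[int, int]:
    """Return (number of policy decisions, number of primitive env steps)."""
    decisions, env_steps, done = 0, 0, False
    obs = env.pos
    while not done:
        obs, _, done, steps = execute_routine(env, policy(obs))
        decisions += 1
        env_steps += steps
    return decisions, env_steps


if __name__ == "__main__":
    # Routines of length 3 versus single primitive actions on the same task.
    print(run_episode(lambda obs: (1, 3), LineWorld()))
    print(run_episode(lambda obs: (1, 1), LineWorld()))
```

In this toy setting both policies take the same number of primitive steps, but the routine-based policy makes fewer decisions per episode. That is one way to read the reduction in agent-environment interactions the abstract mentions, under the assumptions stated above.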
Taxonomy
Topics: Reinforcement Learning in Robotics · Software Engineering Research · Machine Learning and Data Classification
