Learning Routines for Effective Off-Policy Reinforcement Learning

Edoardo Cetin; Oya Celiktutan

arXiv:2106.02943 · cs.LG · June 8, 2021 · 1 citation

TL;DR

This paper introduces a novel framework for reinforcement learning that learns higher-level routines, enabling more efficient off-policy learning with fewer environment interactions and improved performance.

Contribution

The paper presents a new routine-based action space that is learned end-to-end, reducing the need for manual action design in off-policy reinforcement learning.

Findings

01. Performance improvements over state-of-the-art algorithms
02. Fewer environment interactions needed per episode
03. Enhanced computational efficiency

Abstract

The performance of reinforcement learning depends on designing an appropriate action space, where the effect of each action is measurable yet granular enough to permit flexible behavior. So far, this process has involved non-trivial user choices about the available actions and their execution frequency. We propose a novel framework for reinforcement learning that effectively lifts such constraints. Within our framework, agents learn effective behavior over a routine space: a new, higher-level action space, where each routine represents a set of 'equivalent' sequences of granular actions with arbitrary length. Our routine space is learned end-to-end to facilitate the accomplishment of underlying off-policy reinforcement learning objectives. We apply our framework to two state-of-the-art off-policy algorithms and show that the resulting agents obtain relevant performance…
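To make the idea of acting over routines rather than single actions concrete, here is a minimal toy sketch in Python. It is not the authors' method: the paper learns its routine space end-to-end, whereas this example hard-codes a hypothetical decoder (`decode_routine`) in a made-up 1-D environment. Every name here (`decode_routine`, `step`, `run_episode`, `routine_policy`, `granular_policy`) is an assumption introduced only for illustration. The sketch shows that when one choice expands into a variable-length sequence of granular actions, the agent makes fewer decisions for the same number of low-level steps.

```python
from typing import Callable, List, Tuple

# Hypothetical toy setup; none of these names come from the paper.
Action = int                 # granular action: -1 (left) or +1 (right)
Routine = Tuple[int, int]    # (direction, length), an assumed encoding


def decode_routine(routine: Routine) -> List[Action]:
    """Expand a routine into a sequence of granular actions.

    The paper learns this mapping; here it is hard-coded for illustration.
    """
    direction, length = routine
    return [direction] * length


def step(position: int, action: Action) -> int:
    """Toy 1-D environment: the position moves by the action."""
    return position + action


def run_episode(
    policy: Callable[[int, int], Routine],
    start: int,
    goal: int,
    max_steps: int = 100,
) -> Tuple[int, int, int]:
    """Run one episode; return (final position, decisions, env steps)."""
    position, decisions, env_steps = start, 0, 0
    while position != goal and env_steps < max_steps:
        routine = policy(position, goal)   # one decision per routine
        decisions += 1
        for action in decode_routine(routine):
            position = step(position, action)
            env_steps += 1
            if position == goal or env_steps >= max_steps:
                break
    return position, decisions, env_steps


def routine_policy(position: int, goal: int, max_len: int = 4) -> Routine:
    """Pick a routine moving toward the goal, up to max_len steps long."""
    distance = goal - position
    direction = 1 if distance > 0 else -1
    return direction, min(abs(distance), max_len)


def granular_policy(position: int, goal: int) -> Routine:
    """Baseline: every decision yields exactly one granular action."""
    return routine_policy(position, goal, max_len=1)


if __name__ == "__main__":
    print(run_episode(routine_policy, start=0, goal=10))
    print(run_episode(granular_policy, start=0, goal=10))
```

In this toy setting both policies reach the goal in the same number of environment steps, but the routine policy is queried far less often. How routines are represented and trained in the actual framework is described in the paper itself, not here.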

Videos

Learning Routines for Effective Off-Policy Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Software Engineering Research · Machine Learning and Data Classification