Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking

Roland Stolz; Hanna Krasowski; Jakob Thumm; Michael Eichelbeck; Philipp Gassert; Matthias Althoff

arXiv:2406.03704·cs.LG·November 6, 2024

TL;DR

This paper introduces three continuous action masking methods for reinforcement learning that restrict the agent to a state-dependent set of relevant actions, improving training efficiency, safety, and performance in control tasks.

Contribution

The paper proposes novel state-dependent action masking techniques for continuous RL, enhancing learning efficiency and safety by restricting actions to relevant subsets.

Findings

01 Higher final rewards with masking methods

02 Faster convergence compared to baseline

03 Effective in safety-critical applications

Abstract

Continuous action spaces in reinforcement learning (RL) are commonly defined as multidimensional intervals. While intervals usually reflect the action boundaries for tasks well, they can be challenging for learning because the typically large global action space leads to frequent exploration of irrelevant actions. Yet, little task knowledge can be sufficient to identify significantly smaller state-specific sets of relevant actions. Focusing learning on these relevant actions can significantly improve training efficiency and effectiveness. In this paper, we propose to focus learning on the set of relevant actions and introduce three continuous action masking methods for exactly mapping the action space to the state-dependent set of relevant actions. Thus, our methods ensure that only relevant actions are executed, enhancing the predictability of the RL agent and enabling its use in…
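As a rough illustration of what mapping a global action space onto a state-dependent set of relevant actions can look like, the following minimal sketch assumes that both sets are axis-aligned intervals and rescales each action dimension affinely. It is not one of the paper's three masking methods. The function names `relevant_interval` and `mask_action`, the 1D point task, and its limits are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple


def relevant_interval(position: float, limit: float = 1.0,
                      max_step: float = 0.5) -> Tuple[float, float]:
    """Hypothetical task knowledge for a 1D point with |position| <= limit.

    The action is a displacement in [-max_step, max_step]. Only displacements
    that keep the next position inside [-limit, limit] count as relevant.
    """
    low = max(-max_step, -limit - position)
    high = min(max_step, limit - position)
    return low, high


def mask_action(action: Sequence[float],
                global_low: Sequence[float],
                global_high: Sequence[float],
                relevant_low: Sequence[float],
                relevant_high: Sequence[float]) -> List[float]:
    """Affinely map an action from the global box onto the relevant box.

    Assumption of this sketch: the relevant set is an interval per dimension
    and lies inside the global interval.
    """
    masked = []
    for a, gl, gh, rl, rh in zip(action, global_low, global_high,
                                 relevant_low, relevant_high):
        if gh <= gl:
            raise ValueError("global interval must have positive width")
        if not (gl <= rl <= rh <= gh):
            raise ValueError("relevant interval must lie inside the global one")
        # Normalise to [0, 1], clipping actions that fall outside the box.
        t = min(max((a - gl) / (gh - gl), 0.0), 1.0)
        # Rescale into the state-dependent relevant interval.
        masked.append(rl + t * (rh - rl))
    return masked


if __name__ == "__main__":
    # Near the right wall, large positive displacements are irrelevant.
    low, high = relevant_interval(position=0.8)
    for raw in (-0.5, 0.0, 0.5):
        executed = mask_action([raw], [-0.5], [0.5], [low], [high])[0]
        print(f"policy action {raw:+.2f} -> executed action {executed:+.3f}")
```

In this sketch every policy output is mapped to an executable action inside the relevant interval, so the agent never spends exploration on displacements that would leave the allowed region. Relevant sets with more general shapes would need a different mapping.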

Videos

Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking (SlidesLive)

Taxonomy

Topics: Embodied and Extended Cognition

Methods: Sparse Evolutionary Training · Focus