Leveraging Constraint Violation Signals For Action-Constrained Reinforcement Learning

Janaka Chathuranga Brahmanage; Jiajing Ling; Akshat Kumar

arXiv:2502.10431·cs.LG·February 18, 2025

Open Access · 1 Repo

TL;DR

This paper introduces an approach to action-constrained reinforcement learning that trains normalizing flows using constraint violation signals, reducing constraint violations and improving efficiency compared with previous methods.

Contribution

The paper proposes a method that trains normalizing flows with constraint violation signals, which avoids the need to generate feasible action samples and extends to state-wise constraints.

Findings

01 Produces significantly fewer constraint violations in control tasks.

02 Achieves comparable or better control performance.

03 Simplifies learning by eliminating the need for feasible action samples.

Abstract

In many RL applications, ensuring that an agent's actions adhere to constraints is crucial for safety. Most previous methods in Action-Constrained Reinforcement Learning (ACRL) employ a projection layer after the policy network to correct the action. However, projection-based methods suffer from issues such as the zero gradient problem and higher runtime due to their use of optimization solvers. Recently, methods have been proposed that train generative models to learn a differentiable mapping between latent variables and feasible actions to address these issues. However, generative models require training on samples from the constrained action space, which is itself challenging to obtain. To address such limitations, first, we define a target distribution for feasible actions based on constraint violation signals, and train normalizing flows by minimizing the KL divergence between an approximated distribution…
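To make the zero gradient problem mentioned in the abstract concrete, here is a minimal sketch. It is not the paper's method. It assumes a hypothetical constraint (actions must lie in an L2 ball of radius 1), and the function names `constraint_violation`, `project_to_ball`, and `radial_sensitivity` are illustrative, not taken from the paper or its repository. It shows a scalar violation signal and checks numerically that, once the raw action is infeasible, pushing it further outward leaves the projected action unchanged.

```python
import math


def constraint_violation(action, radius=1.0):
    """Scalar violation signal for the assumed constraint ||a||_2 <= radius.

    Returns 0.0 for feasible actions, otherwise the amount by which the
    norm exceeds the radius.
    """
    norm = math.sqrt(sum(x * x for x in action))
    return max(0.0, norm - radius)


def project_to_ball(action, radius=1.0):
    """Euclidean projection onto the L2 ball (closed form, no solver needed)."""
    norm = math.sqrt(sum(x * x for x in action))
    if norm <= radius:
        return list(action)  # already feasible, projection is the identity
    return [x * radius / norm for x in action]


def radial_sensitivity(action, eps=1e-4):
    """Finite-difference estimate of how much the projected action moves
    when the raw action is scaled outward by a factor (1 + eps)."""
    scaled = [x * (1.0 + eps) for x in action]
    before = project_to_ball(action)
    after = project_to_ball(scaled)
    shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))
    return shift / eps


if __name__ == "__main__":
    feasible = [0.3, 0.4]    # norm 0.5, inside the ball
    infeasible = [3.0, 4.0]  # norm 5.0, outside the ball

    print("violation (feasible):  ", constraint_violation(feasible))
    print("violation (infeasible):", constraint_violation(infeasible))
    print("projected (infeasible):", project_to_ball(infeasible))
    # Inside the ball the projected action follows the raw action;
    # outside, the radial direction carries (numerically) no signal.
    print("radial sensitivity (feasible):  ", radial_sensitivity(feasible))
    print("radial sensitivity (infeasible):", radial_sensitivity(infeasible))
```

In this toy setting the projection has a closed form; for general constraint sets a projection layer typically needs an optimization solver, which is the runtime cost the abstract refers to. The violation signal, by contrast, only requires evaluating the constraint at a given action.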

Code & Models

Repositories

rlr-smu/cv-flow (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Normalizing Flows