Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning

Shangding Gu; Bilgehan Sel; Yuhao Ding; Lu Wang; Qingwei Lin; Alois Knoll; Ming Jin

arXiv:2405.16390·cs.AI·May 28, 2024

TL;DR

This paper introduces a primal-based framework for safe multi-objective reinforcement learning that balances multiple goals with safety constraints, using a novel natural policy gradient method and providing theoretical guarantees.

Contribution

It proposes a new natural policy gradient manipulation technique for constrained multi-objective RL and establishes convergence and safety guarantees.

Findings

1. Outperforms prior methods on challenging safe RL tasks
2. Provides theoretical convergence and constraint violation guarantees
3. Effectively balances multiple objectives with safety constraints

Abstract

In numerous reinforcement learning (RL) problems involving safety-critical systems, a key challenge lies in balancing multiple objectives while simultaneously meeting all stringent safety constraints. To tackle this issue, we propose a primal-based framework that orchestrates policy optimization between multi-objective learning and constraint adherence. Our method employs a novel natural policy gradient manipulation method to optimize multiple RL objectives and overcome conflicting gradients between different tasks, since the simple weighted average gradient direction may not be beneficial for specific tasks' performance due to misaligned gradients of different task objectives. When there is a violation of a hard constraint, our algorithm steps in to rectify the policy to minimize this violation. We establish theoretical convergence and constraint violation guarantees in a tabular…
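The two ideas in the abstract, resolving conflicting task gradients and switching to constraint correction when a hard constraint is violated, can be illustrated with a small sketch. This is not the paper's algorithm: the conflict handling below is a PCGrad-style projection swapped in for illustration, gradients are plain vectors rather than natural policy gradients, and the names `project_conflicts`, `update_direction`, `cost_value`, and `cost_limit` are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicts(grads: Sequence[Sequence[float]]) -> Vector:
    """PCGrad-style stand-in (an assumption, not the paper's method).

    For each task gradient, remove the component that points against any
    other task gradient, then average the adjusted gradients.
    """
    adjusted = []
    for i, g in enumerate(grads):
        g_adj = list(g)
        for j, other in enumerate(grads):
            if i == j:
                continue
            inner = dot(g_adj, other)
            if inner < 0:  # conflict: this gradient opposes the other task
                scale = inner / dot(other, other)
                g_adj = [x - scale * y for x, y in zip(g_adj, other)]
        adjusted.append(g_adj)
    n = len(adjusted)
    return [sum(col) / n for col in zip(*adjusted)]


def update_direction(
    objective_grads: Sequence[Sequence[float]],
    cost_grad: Sequence[float],
    cost_value: float,
    cost_limit: float,
) -> Vector:
    """Hypothetical primal-style switch between the two modes.

    If the estimated cost exceeds its limit, step purely to reduce the cost;
    otherwise step along the conflict-adjusted objective direction.
    """
    if cost_value > cost_limit:
        return [-c for c in cost_grad]
    return project_conflicts(objective_grads)


# Two conflicting reward gradients (ascent directions), chosen for illustration.
g1, g2 = [1.0, 0.0], [-2.0, 1.0]
plain_average = [(a + b) / 2 for a, b in zip(g1, g2)]
adjusted = project_conflicts([g1, g2])
print("average helps task 1:", dot(plain_average, g1) >= 0)
print("adjusted helps task 1:", dot(adjusted, g1) >= 0)
print("adjusted helps task 2:", dot(adjusted, g2) >= 0)
```

In this toy case the plain average has a negative inner product with the first task's gradient, which is the misalignment problem the abstract describes, while the adjusted direction does not oppose either task.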

Code & Models

Repositories

SafeRL-Lab/Safe-Multi-Objective-MuJoCo (official)

Taxonomy

Topics: Reinforcement Learning in Robotics