Reinforcement Learning with Convex Constraints

Sobhan Miryoosefi; Kianté Brantley; Hal Daumé III; Miroslav Dudík; Robert Schapire

arXiv:1906.09323 · cs.LG · January 29, 2021 · 32 citations



Open Access · 1 Repo

TL;DR

This paper introduces a flexible reinforcement learning framework that incorporates convex constraints on expected measurements, enabling safer, more diverse, and more expert-like behaviors with theoretical guarantees.

Contribution

It proposes a general algorithmic scheme for constrained RL that handles convex expected-value constraints, extending previous safety-focused methods to new properties like diversity.

Findings

01 Matches the performance of existing algorithms for safety constraints

02 Enforces new properties such as diversity

03 Applies to both model-free and model-based RL

Abstract

In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward. However, many key aspects of a desired behavior are more naturally expressed as constraints. For instance, the designer may want to limit the use of unsafe actions, increase the diversity of trajectories to enable exploration, or approximate expert trajectories when rewards are sparse. In this paper, we propose an algorithmic scheme that can handle a wide class of constraints in RL tasks: specifically, any constraints that require expected values of some vector measurements (such as the use of an action) to lie in a convex set. This captures previously studied constraints (such as safety and proximity to an expert), but also enables new classes of constraints (such as diversity). Our approach comes with rigorous theoretical guarantees and only relies on the ability to approximately solve…
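To make the constraint class in the abstract concrete, the following is a minimal sketch of the feasibility notion only, not the paper's algorithm. It assumes each step of a trajectory emits a measurement vector, estimates the expected total measurement by averaging over sampled trajectories, and checks how far that average lies from a convex set. The box-shaped set, the two-dimensional measurement (unsafe-action indicator, reward), and all function names are illustrative assumptions.

```python
import math


def expected_measurement(trajectories, gamma=1.0):
    """Average the (discounted) sum of per-step measurement vectors.

    `trajectories` is a list of trajectories; each trajectory is a list of
    equal-length measurement vectors, one per step (hypothetical format).
    """
    dim = len(trajectories[0][0])
    total = [0.0] * dim
    for traj in trajectories:
        discount = 1.0
        for z in traj:
            for i in range(dim):
                total[i] += discount * z[i]
            discount *= gamma
    return [t / len(trajectories) for t in total]


def project_onto_box(z, lower, upper):
    """Euclidean projection onto the box {x : lower <= x <= upper}, a convex set."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(z, lower, upper)]


def distance_to_box(z, lower, upper):
    """Euclidean distance from z to the box; zero means the constraint holds."""
    p = project_onto_box(z, lower, upper)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, p)))


# Assumed measurement per step: [unsafe-action indicator, reward].
# Assumed constraint: expected unsafe actions <= 0.5 and expected reward >= 1.0.
LOWER = [0.0, 1.0]
UPPER = [0.5, math.inf]

careful = [[[1, 0.5], [0, 1.0]], [[0, 0.5], [0, 1.0]]]
reckless = [[[1, 0.5], [0, 1.0]], [[1, 0.0], [1, 0.0]]]

z_careful = expected_measurement(careful)
z_reckless = expected_measurement(reckless)

print(z_careful, distance_to_box(z_careful, LOWER, UPPER))
print(z_reckless, distance_to_box(z_reckless, LOWER, UPPER))
```

In this toy setup the first batch of trajectories satisfies the constraint (distance zero), while the second lies outside the set. A constrained RL method would then search for a policy, or a mixture of policies, whose expected measurement vector has distance close to zero.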


Code & Models

Repositories

xkianteb/ApproPO
PyTorch · Official


Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Machine Learning and Algorithms