A Simple Reward-free Approach to Constrained Reinforcement Learning

Sobhan Miryoosefi; Chi Jin

arXiv:2107.05216·cs.LG·July 13, 2021·5 cites

Open Access

TL;DR

This paper introduces a simple meta-algorithm that uses any reward-free reinforcement learning oracle to solve constrained RL problems, in both tabular and linear function approximation settings, with negligible overhead in sample complexity.

Contribution

The paper presents a simple meta-algorithm that uses reward-free RL oracles to solve approachability and constrained RL problems, with sharp sample complexity bounds and an extension to tabular two-player Markov games.

Findings

01 Achieves near-optimal sample complexity for constrained RL in tabular MDPs.

02 Extends the approach to tabular two-player Markov games.

03 Provides new results for constrained RL with linear function approximation.

Abstract

In constrained reinforcement learning (RL), a learning agent seeks not only to optimize the overall reward but also to satisfy additional safety, diversity, or budget constraints. Consequently, existing constrained RL solutions require several new algorithmic ingredients that are notably different from standard RL. On the other hand, reward-free RL has been developed independently in the unconstrained literature; it learns the transition dynamics without using the reward information, and is thus naturally capable of addressing RL with multiple objectives under common dynamics. This paper bridges reward-free RL and constrained RL. In particular, we propose a simple meta-algorithm such that, given any reward-free RL oracle, the approachability and constrained RL problems can be directly solved with negligible overhead in sample complexity. Utilizing the existing reward-free RL solvers, our…
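The two-phase idea in the abstract (learn the dynamics without rewards, then handle reward and constraint on the learned model) can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a tiny random tabular MDP, replaces the reward-free oracle with uniform sampling from a simulator, and uses a generic Lagrangian primal-dual loop for the planning phase. All names (`estimate_model`, `plan`, `evaluate`, `constrained_plan`) and all parameter values are hypothetical.

```python
import numpy as np

def estimate_model(P, n_samples, rng):
    """Stand-in for a reward-free oracle (assumption): sample every (s, a)
    pair n_samples times from a simulator and return empirical transitions."""
    S, A, _ = P.shape
    P_hat = np.zeros_like(P)
    for s in range(S):
        for a in range(A):
            nxt = rng.choice(S, size=n_samples, p=P[s, a])
            P_hat[s, a] = np.bincount(nxt, minlength=S) / n_samples
    return P_hat

def plan(P, r, H):
    """Finite-horizon value iteration; returns a deterministic policy (H, S)."""
    S = P.shape[0]
    V = np.zeros(S)
    pi = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = r + P @ V              # shape (S, A)
        pi[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return pi

def evaluate(P, f, pi, H, s0=0):
    """Expected H-step sum of the per-step signal f under pi, starting at s0."""
    S = P.shape[0]
    idx = np.arange(S)
    V = np.zeros(S)
    for h in reversed(range(H)):
        V = f[idx, pi[h]] + P[idx, pi[h]] @ V
    return V[s0]

def constrained_plan(P_hat, r, c, budget, H, T=200, eta=0.5):
    """Generic primal-dual loop on the estimated model (no new samples):
    plan for r - lam * c, then raise lam when the cost exceeds the budget.
    Returns the value and cost of the uniform mixture over all iterates."""
    lam, rets, costs = 0.0, [], []
    for _ in range(T):
        pi = plan(P_hat, r - lam * c, H)
        rets.append(evaluate(P_hat, r, pi, H))
        costs.append(evaluate(P_hat, c, pi, H))
        lam = max(0.0, lam + eta * (costs[-1] - budget))
    return float(np.mean(rets)), float(np.mean(costs)), lam

rng = np.random.default_rng(0)
S, A, H = 4, 2, 5
P = rng.dirichlet(np.ones(S), size=(S, A))   # true dynamics, shape (S, A, S)
r = rng.uniform(size=(S, A))                 # reward, revealed only after exploration
c = rng.uniform(size=(S, A))                 # cost signal for the constraint
P_hat = estimate_model(P, 2000, rng)         # phase 1: reward-free
avg_ret, avg_cost, lam = constrained_plan(P_hat, r, c, budget=2.0, H=H)  # phase 2
print(avg_ret, avg_cost, lam)
```

Exploration in this sketch never looks at `r` or `c`, so the same `P_hat` could be reused for a different reward, cost, or budget without collecting new samples. That reuse is the property of reward-free RL that the abstract highlights.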

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Supply Chain and Inventory Management