A Simple Reward-free Approach to Constrained Reinforcement Learning
Sobhan Miryoosefi, Chi Jin

TL;DR
This paper introduces a simple method that uses reward-free reinforcement learning to solve constrained RL problems in both tabular and linear function approximation settings, with negligible overhead in sample complexity.
Contribution
The paper presents a simple meta-algorithm that, given any reward-free RL oracle, solves approachability and constrained RL problems. It provides sharp sample complexity bounds and extends to Markov games.
Findings
Achieves near-optimal sample complexity for constrained RL in tabular MDPs.
Extends the approach to tabular two-player Markov games.
Provides new results for constrained RL with linear function approximation.
Abstract
In constrained reinforcement learning (RL), a learning agent seeks not only to optimize the overall reward but also to satisfy additional safety, diversity, or budget constraints. Consequently, existing constrained RL solutions require several new algorithmic ingredients that are notably different from standard RL. Reward-free RL, on the other hand, was developed independently in the unconstrained literature; it learns the transition dynamics without using reward information and is thus naturally capable of addressing RL with multiple objectives under common dynamics. This paper bridges reward-free RL and constrained RL. In particular, we propose a simple meta-algorithm such that, given any reward-free RL oracle, the approachability and constrained RL problems can be directly solved with negligible overhead in sample complexity. Utilizing the existing reward-free RL solvers, our…
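The two-phase idea in the abstract (learn the dynamics without rewards, then handle the constraints by planning on the learned model) can be illustrated with a rough sketch. This is not the paper's procedure: it assumes a tabular finite-horizon model `P_hat` has already been returned by some reward-free exploration oracle, and it swaps in a plain Lagrangian primal-dual loop for the planning phase. The function names, the two-component return vector (reward, cost), and the parameters `budget`, `eta`, and `lam_max` are all hypothetical choices made for illustration.

```python
import numpy as np

def plan(P_hat, r, H):
    """Finite-horizon value iteration on an estimated model.
    P_hat: (S, A, S) transition estimates, r: (S, A) scalar reward.
    Returns a deterministic policy of shape (H, S)."""
    S = P_hat.shape[0]
    V = np.zeros(S)
    pi = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = r + P_hat @ V              # (S, A)
        pi[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return pi

def evaluate(P_hat, R, pi, H, s0=0):
    """Expected cumulative vector return of pi under P_hat.
    R: (S, A, d) vector-valued reward; returns a length-d vector."""
    S, d = P_hat.shape[0], R.shape[2]
    idx = np.arange(S)
    V = np.zeros((S, d))
    for h in reversed(range(H)):
        a = pi[h]
        V = R[idx, a] + P_hat[idx, a] @ V
    return V[s0]

def solve_constrained(P_hat, R, H, budget, T=200, eta=0.1, lam_max=10.0):
    """Assumed setup: R[..., 0] is reward, R[..., 1] is cost, and the goal is
    to maximize reward subject to expected cost <= budget.
    No new samples are drawn here; everything runs on the learned model."""
    lam = 0.0                          # dual variable for the cost constraint
    policies, returns = [], []
    for _ in range(T):
        scalar_r = R[:, :, 0] - lam * R[:, :, 1]   # Lagrangian scalarization
        pi = plan(P_hat, scalar_r, H)              # best response to lam
        ret = evaluate(P_hat, R, pi, H)
        policies.append(pi)
        returns.append(ret)
        # Projected gradient ascent on the dual variable.
        lam = float(np.clip(lam + eta * (ret[1] - budget), 0.0, lam_max))
    # The uniform mixture over `policies` attains the averaged return.
    return policies, np.mean(returns, axis=0)
```

Because the reward is only used inside `solve_constrained`, the same `P_hat` can be reused for any number of reward or cost vectors, which is the property of reward-free RL that the abstract relies on. The output is a mixture of policies rather than a single policy, since individual deterministic policies may each violate or over-satisfy the constraint.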
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Supply Chain and Inventory Management
