Reinforcement Learning with Almost Sure Constraints

Agustin Castellano; Hancheng Min; Juan Bazerque; Enrique Mallada

arXiv:2112.05198 · cs.LG · February 14, 2023 · 1 cites

TL;DR

This paper introduces a novel approach for finding feasible policies in Constrained Markov Decision Processes using a scalar budget variable, enabling almost sure constraint satisfaction and improving policy search efficiency.

Contribution

The paper proposes a new class of policies equipped with a budget variable, analyzes the Bellman-like operator whose smallest fixed point gives the minimal budget, and provides learning methods with sample-complexity bounds.

Findings

01. The minimal budget can be computed as the smallest fixed point of a Bellman-like operator.

02. The approach targets constraint satisfaction with probability one, rather than only in expectation.

03. Simulations demonstrate the effectiveness of the method in constrained policy optimization.
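
To make the contrast between expectation-based and almost sure constraints concrete, here is a minimal arithmetic sketch. The damage distribution and the budget value are made up for illustration and do not come from the paper.

```python
# Hypothetical total-damage distribution of some policy: (damage, probability).
outcomes = [(0, 0.9), (3, 0.1)]
budget = 1  # illustrative constraint level

# Expectation-based constraint: average damage must not exceed the budget.
expected_damage = sum(d * p for d, p in outcomes)
meets_expectation = expected_damage <= budget

# Almost sure constraint: every outcome with positive probability must
# stay within the budget, so only the worst case matters.
worst_damage = max(d for d, p in outcomes if p > 0)
meets_almost_sure = worst_damage <= budget

print(meets_expectation, meets_almost_sure)
```

In this toy case the policy passes the expectation test (average damage is about 0.3) yet fails the probability-one test, because one run in ten incurs damage 3.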

Abstract

In this work we address the problem of finding feasible policies for Constrained Markov Decision Processes under probability one constraints. We argue that stationary policies are not sufficient for solving this problem, and that a rich class of policies can be found by endowing the controller with a scalar quantity, the so-called budget, that tracks how close the agent is to violating the constraint. We show that the minimal budget required to act safely can be obtained as the smallest fixed point of a Bellman-like operator, whose convergence properties we analyze. We also show how to learn this quantity when the true kernel of the Markov decision process is not known, and we provide sample-complexity bounds. The utility of knowing this minimal budget lies in the fact that it can aid the search for optimal or near-optimal policies by shrinking down the region of the state space the…
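
The abstract does not spell out the operator, so the following is only a sketch under an assumed form: (T b)(s) = min over actions a of max over reachable successors s' of [damage(s, a, s') + b(s')], with nonnegative integer damage. Iterating a monotone operator from the all-zero function approaches its smallest fixed point. The names `minimal_budget`, `successors`, `damage`, and `cap` are hypothetical, and the cap is a crude device for marking states where no finite budget suffices.

```python
import math

def minimal_budget(states, actions, successors, damage, cap=10):
    """Iterate the assumed operator from zero until it stops changing.

    successors(s, a) -> states reachable with positive probability.
    damage(s, a, t)  -> nonnegative integer damage of the transition.
    Values that grow past `cap` are treated as infinite (no safe policy).
    """
    b = {s: 0 for s in states}
    while True:
        new = {}
        for s in states:
            # Best action against the worst reachable successor.
            best = min(
                max(damage(s, a, t) + b[t] for t in successors(s, a))
                for a in actions
            )
            new[s] = math.inf if best > cap else best
        if new == b:
            return b
        b = new

# Toy example (made up): state 0 is safe, state 1 can exit to 0 at a cost
# of one unit of damage, state 2 is a trap that accrues damage forever.
def successors(s, a):
    return [0] if (s == 1 and a == "exit") else [s]

def damage(s, a, t):
    return 0 if s == 0 else 1

budgets = minimal_budget([0, 1, 2], ["stay", "exit"], successors, damage)
print(budgets)
```

On this toy model the iteration assigns budget 0 to the safe state, 1 to the state that can exit, and infinity to the trap. States with infinite minimal budget can be excluded from any later policy search, which is one way to read the abstract's remark about shrinking the region of the state space.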

Taxonomy

Topics: Reinforcement Learning in Robotics