AutoCost: Evolving Intrinsic Cost for Zero-violation Reinforcement Learning

Tairan He; Weiye Zhao; Changliu Liu

arXiv:2301.10339 · cs.LG · January 26, 2023 · 1 citation

TL;DR

AutoCost introduces an automatic method to find intrinsic cost functions that enable constrained reinforcement learning algorithms to achieve zero constraint violations while maintaining competitive performance.

Contribution

The paper proposes AutoCost, a framework that automatically searches for intrinsic cost functions to improve safety in constrained RL, achieving zero violations.

Findings

01. AutoCost finds cost functions that lead to zero violations in Safety Gym.

02. Intrinsic costs enable policies to satisfy safety constraints without performance loss.

03. The method outperforms baseline approaches in safety benchmarks.

Abstract

Safety is a critical hurdle that limits the application of deep reinforcement learning (RL) to real-world control tasks. To this end, constrained reinforcement learning leverages cost functions to improve safety in constrained Markov decision processes. However, such constrained RL methods fail to achieve zero violation even when the cost limit is zero. This paper analyzes the reason for such failure, which suggests that a proper cost function plays an important role in constrained RL. Inspired by the analysis, we propose AutoCost, a simple yet effective framework that automatically searches for cost functions that help constrained RL to achieve zero-violation performance. We validate the proposed method and the searched cost function on the safe RL benchmark Safety Gym. We compare the performance of augmented agents that use our cost function to provide additive intrinsic costs with…
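
The idea of an additive intrinsic cost found by automatic search can be illustrated with a small sketch. This is not the authors' implementation: the linear-then-clipped form of `intrinsic_cost`, the feature vector, the elitist mutation loop in `evolve`, and the toy fitness used in the usage note are all assumptions made for illustration. In the setting the abstract describes, the fitness of a candidate cost function would presumably come from training and evaluating a constrained RL agent, which is far more expensive than the stand-in used here.

```python
import random


def intrinsic_cost(params, features):
    """Hypothetical intrinsic cost: a linear function of state features,
    clipped at zero so that it never reduces the total cost (assumption)."""
    return max(0.0, sum(w * f for w, f in zip(params, features)))


def augmented_cost(extrinsic, params, features):
    """Additive augmentation: the environment's (extrinsic) cost plus the
    intrinsic cost computed from the same state features."""
    return extrinsic + intrinsic_cost(params, features)


def evolve(fitness, dim, population=8, generations=20, sigma=0.1, seed=0):
    """Toy elitist evolutionary search over cost-function parameters.

    Each generation keeps the best half of the population unchanged and
    refills the rest with Gaussian-mutated copies of randomly chosen elites.
    `fitness` maps a parameter list to a score (higher is better).
    """
    rng = random.Random(seed)
    pop = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elites = pop[: population // 2]
        children = [
            [w + rng.gauss(0.0, sigma) for w in rng.choice(elites)]
            for _ in range(population - len(elites))
        ]
        pop = elites + children
    return max(pop, key=fitness)
```

As a usage example, `evolve(lambda p: -sum((w - 1.0) ** 2 for w in p), dim=2)` searches for parameters near `[1.0, 1.0]` under a toy fitness; because the elites are carried over unchanged, the best fitness in the population never decreases from one generation to the next.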

Videos

AutoCost: Evolving Intrinsic Cost for Zero-violation Reinforcement Learning · underline

Taxonomy

Topics: Reinforcement Learning in Robotics