OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning

Yihang Yao; Zhepeng Cen; Wenhao Ding; Haohong Lin; Shiqi Liu; Tingnan Zhang; Wenhao Yu; Ding Zhao

arXiv:2407.14653·cs.LG·July 23, 2024·2 cites

TL;DR

OASIS introduces a novel offline safe RL method using a conditional diffusion model to synthesize datasets, improving safety, data efficiency, and performance in constrained environments.

Contribution

The paper proposes OASIS, a new paradigm employing conditional diffusion models to shape offline datasets for enhanced safe reinforcement learning.

Findings

01. Outperforms existing methods on public benchmarks.

02. Achieves high reward and safety compliance.

03. Demonstrates robustness and data efficiency.

Abstract

Offline safe reinforcement learning (RL) aims to train a policy that satisfies constraints using a pre-collected dataset. Most current methods struggle with the mismatch between imperfect demonstrations and the desired safe and rewarding performance. In this paper, we introduce OASIS (cOnditionAl diStributIon Shaping), a new paradigm in offline safe RL designed to overcome these critical limitations. OASIS utilizes a conditional diffusion model to synthesize offline datasets, thus shaping the data distribution toward a beneficial target domain. Our approach promotes compliance with safety constraints through effective data utilization and regularization techniques that benefit offline safe RL training. Comprehensive evaluations on public benchmarks and varying datasets showcase OASIS's superiority in benefiting offline safe RL agents to achieve high-reward behavior while satisfying the…
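
For readers new to the setting, the problem named in the first sentence of the abstract is commonly written as a constrained optimization over policies. The block below is a sketch of that standard constrained-MDP formulation, not an equation quoted from the paper. The symbols (policy \pi, reward r, cost c, discount \gamma, cost threshold \kappa, dataset \mathcal{D}) are assumptions introduced here for illustration.

```latex
% Standard constrained-MDP objective (illustrative notation, not taken from the paper)
\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{T} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{s.t.} \quad
\mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{T} \gamma^{t}\, c(s_t, a_t) \right] \le \kappa
% Offline setting: \pi is learned only from a fixed, pre-collected dataset
% \mathcal{D} = \{ (s, a, s', r, c) \}, with no further environment interaction.
```

In this notation, the "imperfect demonstrations" mentioned in the abstract correspond to trajectories in \mathcal{D} whose reward or cost returns are far from the desired high-reward, low-cost target.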

Code & Models

Repositories

yihangyao/OASIS (PyTorch, official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Reinforcement Learning in Robotics

Methods: Diffusion · OASIS