Model-Based Proactive Cost Generation for Learning Safe Policies Offline with Limited Violation Data
Ruiqi Xue, Lei Yuan, Kainuo Cheng, Jing-Wen Yang, Yang Yu

TL;DR
This paper introduces PROCO, a model-based offline safe reinforcement learning framework that uses large language models (LLMs) to estimate risk and improve safety when the offline dataset contains few or no unsafe samples.
Contribution
PROCO integrates natural language knowledge from LLMs into model-based offline RL so that risk can be estimated even when the dataset contains no observed violations, improving safety in high-stakes tasks.
Findings
PROCO reduces constraint violations across Safety-Gymnasium tasks.
It effectively synthesizes unsafe samples for better feasibility estimation.
PROCO outperforms baseline methods in safety performance.
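To make the second finding concrete, here is a minimal sketch of how a dynamics model can turn a violation-free dataset into labeled unsafe samples. It is a toy illustration under stated assumptions, not PROCO's implementation: `model_step` is a hand-written 1-D point mass standing in for a learned model, `cost` is a hand-written stand-in for whatever cost function the method derives, and `X_MAX`, `synthesize_unsafe`, and the random action sampling are all hypothetical.

```python
import random

X_MAX = 1.0  # hypothetical constraint: position must stay below this value


def model_step(state, action, dt=0.1):
    """Toy stand-in for a learned dynamics model: a 1-D point mass."""
    x, v = state
    return (x + v * dt, v + action * dt)


def cost(state):
    """Hand-written stand-in cost: 1.0 if the constraint is violated, else 0.0."""
    return 1.0 if state[0] >= X_MAX else 0.0


def synthesize_unsafe(start_states, horizon=10, seed=0):
    """Roll the model forward from safe dataset states with random actions
    and keep the first transition (s, a, s') whose next state is unsafe."""
    rng = random.Random(seed)
    unsafe = []
    for s in start_states:
        for _ in range(horizon):
            a = rng.uniform(-1.0, 1.0)
            nxt = model_step(s, a)
            if cost(nxt) > 0:
                unsafe.append((s, a, nxt))
                break
            s = nxt
    return unsafe


# Every state in this toy "offline dataset" currently satisfies the constraint.
safe_dataset = [(0.0, 0.1), (0.5, 0.5), (0.9, 2.0)]
unsafe_samples = synthesize_unsafe(safe_dataset)
```

The rollouts never touch a real environment, so the unsafe transitions come at no physical risk. How well they match reality depends entirely on the accuracy of the model and the cost function.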
Abstract
Learning constraint-satisfying policies from offline data without risky online interaction is crucial for safety-critical decision making. Conventional methods typically learn cost value functions from abundant unsafe samples to define safety boundaries and penalize violations. However, in high-stakes scenarios, risky trial-and-error is infeasible, yielding datasets with few or no unsafe samples. Under this limitation, existing approaches often treat all data as uniformly safe and overlook safe-but-infeasible states (states that currently satisfy constraints but inevitably violate them within a few steps), which leads to deployment failures. Drawing inspiration from the concept of knowledge-data integration, we leverage large language models (LLMs) to incorporate natural language knowledge into the policy to address this challenge. Specifically, we propose PROCO, a model-based offline safe…
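The notion of a safe-but-infeasible state can be illustrated with a small sketch. This is a toy example assuming a 1-D cart approaching a wall, not anything taken from the paper: the constants `X_MAX`, `A_MAX`, `DT` and the functions `is_safe` and `is_feasible` are hypothetical names chosen for illustration.

```python
X_MAX = 1.0  # hypothetical wall position; the constraint is x < X_MAX
A_MAX = 1.0  # hypothetical maximum braking deceleration
DT = 0.1     # simulation time step


def is_safe(state):
    """A state is safe if it satisfies the constraint right now."""
    x, _ = state
    return x < X_MAX


def is_feasible(state, horizon=50):
    """A state is feasible if some policy keeps it safe from here on.

    In this toy, full braking is the best the cart can do to avoid the
    wall, so if full braking still hits the wall, no policy avoids it.
    """
    x, v = state
    for _ in range(horizon):
        if x >= X_MAX:
            return False
        v = max(v - A_MAX * DT, 0.0)  # brake as hard as possible, stop at rest
        x = x + v * DT
    return x < X_MAX


slow = (0.5, 0.5)  # mid-track, moderate speed: can stop in time
fast = (0.9, 2.0)  # still short of the wall, but too fast to stop

print(is_safe(slow), is_feasible(slow))
print(is_safe(fast), is_feasible(fast))
```

The `fast` state satisfies the constraint at the current step, yet even full braking carries it past the wall. A dataset that labels such states as uniformly safe hides exactly the failure mode the abstract describes.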
