SafeDreamer: Safe Reinforcement Learning with World Models

Weidong Huang; Jiaming Ji; Chunhe Xia; Borong Zhang; Yaodong Yang

arXiv:2307.07176·cs.LG·August 9, 2024

TL;DR

SafeDreamer is a safe reinforcement learning algorithm that integrates Lagrangian methods into world-model planning within the Dreamer framework, achieving nearly zero-cost performance on Safety-Gymnasium tasks with both low-dimensional and vision-only inputs.

Contribution

It introduces SafeDreamer, combining world models with Lagrangian safety constraints within the Dreamer framework for improved safe RL performance.

Findings

1. Achieves nearly zero-cost safety performance on Safety-Gymnasium tasks.
2. Effective in both low-dimensional and vision-only input scenarios.
3. Demonstrates superior safety-performance trade-offs compared to existing methods.

Abstract

The deployment of Reinforcement Learning (RL) in real-world applications is constrained by its failure to satisfy safety criteria. Existing Safe Reinforcement Learning (SafeRL) methods, which rely on cost functions to enforce safety, often fail to achieve zero-cost performance in complex scenarios, especially vision-only tasks. These limitations are primarily due to model inaccuracies and inadequate sample efficiency. The integration of the world model has proven effective in mitigating these shortcomings. In this work, we introduce SafeDreamer, a novel algorithm incorporating Lagrangian-based methods into world model planning processes within the superior Dreamer framework. Our method achieves nearly zero-cost performance on various tasks, spanning low-dimensional and vision-only input, within the Safety-Gymnasium benchmark, showcasing its efficacy in balancing performance and safety…

Peer Reviews

Decision: ICLR 2024 poster

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

The method is straightforward and promising. Lagrangian-based methods have proven effective, but they require setting a cost threshold, and at test time the cost of a fully trained agent stays around that threshold. In this work, by using the world model and a trained cost estimator, the agent achieves much higher safety performance thanks to internal planning in the latent space.
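The threshold behavior described in this review can be illustrated with a generic Lagrangian multiplier update. This is a minimal sketch of plain dual ascent, not SafeDreamer's implementation (the reviews mention Augmented Lagrangian and PID Lagrangian variants); the function names, learning rate, and numbers are all hypothetical.

```python
def update_multiplier(lam, episode_cost, cost_threshold, lr=0.05):
    """One projected gradient-ascent step on the Lagrange multiplier.

    The multiplier grows while the measured cost exceeds the threshold,
    shrinks while the cost is below it, and is clipped at zero.
    """
    return max(0.0, lam + lr * (episode_cost - cost_threshold))


def penalized_objective(reward_return, cost_return, lam):
    """Scalar objective the policy maximizes for a fixed multiplier."""
    return reward_return - lam * cost_return


# Hypothetical numbers: cost above, at, and below the threshold.
lam_up = update_multiplier(0.0, episode_cost=30.0, cost_threshold=25.0, lr=0.1)
lam_same = update_multiplier(1.0, episode_cost=25.0, cost_threshold=25.0, lr=0.1)
lam_down = update_multiplier(0.1, episode_cost=0.0, cost_threshold=25.0, lr=0.1)
```

Because the multiplier stops changing once the cost equals the threshold, a policy trained this way has no pressure to push the cost below the threshold, which matches the behavior the reviewer describes.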

Weaknesses

The paper proposes OSRP, OSRP-Lag, and BSRP-Lag, but the differences between them are not well motivated or presented. In Section 3.2, the sampled trajectories seem to consist of two parts: one set of trajectories is drawn from a Normal action distribution and another set comes from the current policy. This part is confusing and not well presented. The notation is not aligned between the SafeDreamer paragraph and Algorithm 1. I would recommend rephrasing the method part and marking the Algorithm line number i

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 5
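To make the two-source sampling in this comment concrete, the following sketch pools candidate trajectories from two sources and applies a constrained cross-entropy-style elite selection. It is an assumption-laden illustration rather than the paper's Algorithm 1: the selection rule is the commonly described CCEM heuristic, and the names `select_elites`, `cost_limit`, and the candidate tuples are hypothetical.

```python
def select_elites(candidates, num_elites, cost_limit):
    """Pick elite trajectories from (label, reward_return, cost_return) tuples.

    If enough candidates satisfy the cost limit, keep the highest-reward
    safe ones; otherwise fall back to the lowest-cost candidates overall.
    """
    safe = [c for c in candidates if c[2] <= cost_limit]
    if len(safe) < num_elites:
        # Too few safe candidates: prioritize reducing cost.
        ranked = sorted(candidates, key=lambda c: c[2])
    else:
        # Enough safe candidates: maximize reward among them.
        ranked = sorted(safe, key=lambda c: c[1], reverse=True)
    return ranked[:num_elites]


# Hypothetical pool: two trajectories from a Gaussian action distribution
# and two from the current policy, each already scored by a world model.
gaussian_samples = [("g0", 5.0, 0.0), ("g1", 9.0, 3.0)]
policy_samples = [("p0", 7.0, 0.5), ("p1", 8.0, 0.0)]
pool = gaussian_samples + policy_samples

elites = select_elites(pool, num_elites=2, cost_limit=1.0)
elite_labels = [label for label, _, _ in elites]
```

In a full planner the elites would be used to refit the Gaussian action distribution before the next sampling round; that step is omitted here.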

Strengths

- The paper is mostly well-written and investigates an important problem. The proposed approach of combining DreamerV3 with CCEM and Lagrangian SafeRL methods is novel, interesting, and very well-motivated.
- I especially liked the impressive empirical results showing that SafeDreamer is able to achieve near-zero cost returns while still having good reward returns.
- The paper shows numerous empirical results comparing SafeDreamer with many relevant model-based and model-free baselines, in a n

Weaknesses

**MAJOR**

1- **Contributions**:
- SafeDreamer (OSRP, OSRP-Lag, and BSRP-Lag) seems like a straightforward combination of prior works (DreamerV3, CCEM, Augmented Lagrangian, and PID Lagrangian). For example, if I understand correctly, OSRP (Sec 3.2) is exactly CCEM but using the TD($\lambda$) objectives from DreamerV3 (for the cost and reward estimates). It is not clear what nuances make this combination less straightforward than one would expect. No theory nor empirical analysis is

Reviewer 03 · Rating 8 (accept, good paper) · Confidence 5
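The TD($\lambda$) objective this comment refers to can be sketched as a backward recursion over an imagined rollout. The code below is a generic $\lambda$-return computation under stated assumptions: the function name and default values are illustrative, episode-continuation flags are omitted, and nothing here is taken from the SafeDreamer code base. The same recursion applies to cost estimates by passing per-step costs and cost-critic values instead of rewards and reward-critic values.

```python
def lambda_returns(rewards, values, gamma=0.997, lam=0.95):
    """Compute lambda-returns for one trajectory.

    rewards: per-step rewards r_0 .. r_{T-1}
    values:  critic estimates v_0 .. v_T (v_T bootstraps the tail)
    Recursion: R_t = r_t + gamma * ((1 - lam) * v_{t+1} + lam * R_{t+1}),
    with R_T = v_T.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have one more entry than rewards")
    returns = [0.0] * len(rewards)
    next_return = values[-1]
    for t in reversed(range(len(rewards))):
        next_return = rewards[t] + gamma * (
            (1 - lam) * values[t + 1] + lam * next_return
        )
        returns[t] = next_return
    return returns


# lam = 1 reduces to discounted Monte Carlo returns with a bootstrap tail;
# lam = 0 reduces to one-step TD targets r_t + gamma * v_{t+1}.
monte_carlo = lambda_returns([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0)
one_step = lambda_returns([1.0, 2.0], [0.0, 4.0, 6.0], gamma=0.5, lam=0.0)
```

Intermediate values of `lam` interpolate between these two extremes, trading the bias of the critic against the variance of long rollouts.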

Strengths

1. This is a well-written paper and I enjoyed reading it. Though some claims need further assessment, it is easy to follow in general. The paper provides a detailed explanation of the SafeDreamer algorithm, including its components and integration methods. The inclusion of algorithms, figures, and experimental results demonstrates a high level of technical rigor and quality in the research.
2. SafeDreamer effectively balances long-term rewards and costs, a crucial aspect of SafeRL. The ability

Weaknesses

1. While the paper compares SafeDreamer with several existing SafeRL methods, it lacks a comprehensive comparison with a broader range of related work in the field. A more extensive comparison could provide a clearer context for SafeDreamer's contributions and limitations. For example, the authors noticed the failure of Lagrangian methods and CPO to achieve zero cost and concluded that the reason behind it is the lack of long-horizon planning. Actually, there are several works [1, 2] that have

Code & Models

Repositories

pku-alignment/safedreamer · JAX · Official

Models

🤗 Weidong-Huang/SafeDreamer · model · ♡ 2

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications