Safe Planning and Policy Optimization via World Model Learning
Artem Latyshev, Gregory Gorbov, Aleksandr I. Panov

TL;DR
This paper introduces a model-based reinforcement learning framework that improves both safety and task performance in safety-critical settings. It adaptively switches between model-based planning and direct policy execution, which limits the impact of world model errors while respecting safety constraints.
Contribution
It proposes an adaptive mechanism for switching between planning and policy execution, and employs dynamic safety thresholds to improve safety and performance in model-based RL.
Findings
Significant safety and performance improvements over non-adaptive methods
Robust performance on diverse safety-critical continuous control tasks
Outperforms existing methods in safety-critical settings
Abstract
Reinforcement Learning (RL) applications in real-world scenarios must prioritize safety and reliability, which impose strict constraints on agent behavior. Model-based RL leverages predictive world models for action planning and policy optimization, but inherent model inaccuracies can lead to catastrophic failures in safety-critical settings. We propose a novel model-based RL framework that jointly optimizes task performance and safety. To address world model errors, our method incorporates an adaptive mechanism that dynamically switches between model-based planning and direct policy execution. We resolve the objective mismatch problem of traditional model-based approaches using an implicit world model. Furthermore, our framework employs dynamic safety thresholds that adapt to the agent's evolving capabilities, consistently selecting actions that surpass safe policy suggestions in both…
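The switching mechanism and dynamic threshold described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Candidate`, `select_action`, `update_cost_threshold`, `model_error`, `error_tolerance`, and `rate` are hypothetical, and both the switching rule and the threshold update are assumptions chosen for illustration. In particular, the abstract is truncated where it describes how planned actions are compared with the safe policy's suggestions, so the comparison on both predicted return and predicted cost below is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A proposed action with model-predicted return and safety cost (hypothetical)."""
    action: float
    predicted_return: float
    predicted_cost: float


def select_action(planned, policy, model_error, error_tolerance, cost_threshold):
    """Choose between the planner's and the policy's proposal.

    Assumed rule (illustrative only): trust the planner when the world model's
    estimated error is within tolerance and the planned action is predicted to
    be under the cost threshold and no worse than the policy's action on both
    return and cost. Otherwise fall back to direct policy execution.
    """
    if model_error > error_tolerance:
        # The world model looks unreliable, so skip planning.
        return "policy", policy
    planner_is_safe = planned.predicted_cost <= cost_threshold
    planner_not_worse = (
        planned.predicted_cost <= policy.predicted_cost
        and planned.predicted_return >= policy.predicted_return
    )
    if planner_is_safe and planner_not_worse:
        return "planner", planned
    return "policy", policy


def update_cost_threshold(threshold, recent_episode_cost, cost_limit, rate=0.1):
    """Move the threshold toward the agent's recent cost, capped at cost_limit.

    Assumed update (illustrative only): an exponential moving average, so the
    threshold tightens as the agent incurs less cost.
    """
    target = min(recent_episode_cost, cost_limit)
    return threshold + rate * (target - threshold)
```

In this sketch, a large `model_error` always routes control to the policy, and the threshold tightens only gradually, which avoids abrupt changes in which actions count as safe.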
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · AI-based Problem Solving and Planning
