On the Equilibrium between Feasible Zone and Uncertain Model in Safe Exploration
Yujie Yang, Zhilong Zheng, Shengbo Eben Li

TL;DR
This paper introduces a novel safe exploration framework in reinforcement learning that balances expanding the feasible exploration zone with reducing model uncertainty, converging to an equilibrium for safer and more effective exploration.
Contribution
The paper presents the first equilibrium-oriented safe exploration framework, safe equilibrium exploration (SEE), which iteratively refines the environment model and the feasible zone until the two reach an equilibrium.
Findings
SEE monotonically refines the uncertain model.
The feasible zone expands monotonically.
The algorithm converges to the safe exploration equilibrium.
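To make the three findings concrete, here is a minimal toy sketch of an alternating zone/model iteration on a 10-state chain. It is not the authors' SEE algorithm. The chain, the drift function, the prior, the rule that exploring reveals the drift within one step of the zone, and every function name (`max_feasible_zone`, `refine_model`, `alternate_until_equilibrium`) are hypothetical choices made for illustration only. The "uncertain model" is represented as a set of possible drift values per state, and the "feasible zone" as the largest set of safe states that some action can keep invariant under every drift the model still allows.

```python
# Toy illustration only; all names and dynamics are assumptions, not from the paper.
N = 10                 # states 0..9 on a chain
UNSAFE = {9}           # constraint-violating state
ACTIONS = (-1, 0, 1)


def true_drift(x):
    # Hypothetical ground-truth disturbance, unknown to the learner.
    return 0 if x <= 5 else 2


def step(x, a, d):
    # Next state under action a and drift d, clipped to the chain.
    return min(max(x + a + d, 0), N - 1)


def max_feasible_zone(model):
    # Greatest robust invariant set: repeatedly drop states for which no
    # action keeps every model-consistent successor inside the zone.
    zone = set(range(N)) - UNSAFE
    changed = True
    while changed:
        changed = False
        for x in sorted(zone):
            ok = any(all(step(x, a, d) in zone for d in model[x])
                     for a in ACTIONS)
            if not ok:
                zone.discard(x)
                changed = True
    return zone


def refine_model(model, zone):
    # Assumption: exploring inside the zone reveals the true drift at every
    # state within one step of the zone. Sets can only shrink.
    new = {x: set(ds) for x, ds in model.items()}
    for x in range(N):
        if any(abs(x - z) <= 1 for z in zone):
            new[x] &= {true_drift(x)}
    return new


def alternate_until_equilibrium(model):
    # Alternate zone computation and model refinement until neither changes.
    zone = max_feasible_zone(model)
    history = [sorted(zone)]
    while True:
        new_model = refine_model(model, zone)
        new_zone = max_feasible_zone(new_model)
        if new_model == model and new_zone == zone:
            return zone, model, history
        model, zone = new_model, new_zone
        history.append(sorted(zone))


# Prior: drift known near the start, fully uncertain elsewhere.
prior = {x: ({0} if x <= 2 else {0, 1, 2}) for x in range(N)}
zone, model, history = alternate_until_equilibrium(prior)
for z in history:
    print(z)
```

In this toy setting the zone grows one state per round from `[0, 1, 2]` and then stops at `[0, 1, 2, 3, 4, 5]`, because the revealed drift at state 6 pushes every successor toward the unsafe state. The model sets only ever shrink, the zone only ever grows, and the loop stops when neither changes, which mirrors the monotonicity and convergence properties listed above on a much simpler problem.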
Abstract
Ensuring the safety of environmental exploration is a critical problem in reinforcement learning (RL). While limiting exploration to a feasible zone has become widely accepted as a way to ensure safety, key questions remain unresolved: what is the maximum feasible zone achievable through exploration, and how can it be identified? This paper, for the first time, answers these questions by revealing that the goal of safe exploration is to find the equilibrium between the feasible zone and the environment model. This conclusion is based on the understanding that these two components are interdependent: a larger feasible zone leads to a more accurate environment model, and a more accurate model, in turn, enables exploring a larger zone. We propose the first equilibrium-oriented safe exploration framework called safe equilibrium exploration (SEE), which alternates between finding the maximum…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Robot Manipulation and Learning
