On the Equilibrium between Feasible Zone and Uncertain Model in Safe Exploration

Yujie Yang; Zhilong Zheng; Shengbo Eben Li

arXiv:2602.00636 · cs.LG · February 5, 2026

TL;DR

This paper introduces a safe exploration framework for reinforcement learning that alternates between expanding the feasible exploration zone and reducing model uncertainty, and that converges to an equilibrium between the two.

Contribution

The paper presents safe equilibrium exploration (SEE), which it describes as the first equilibrium-oriented safe exploration framework. SEE iteratively refines the uncertain model and the feasible zone until they reach an equilibrium.

Findings

01. SEE monotonically refines the uncertain model
02. Feasible zones expand monotonically
03. Algorithm converges to safe exploration equilibrium

Abstract

Ensuring the safety of environmental exploration is a critical problem in reinforcement learning (RL). While limiting exploration to a feasible zone has become widely accepted as a way to ensure safety, key questions remain unresolved: what is the maximum feasible zone achievable through exploration, and how can it be identified? This paper, for the first time, answers these questions by revealing that the goal of safe exploration is to find the equilibrium between the feasible zone and the environment model. This conclusion is based on the understanding that these two components are interdependent: a larger feasible zone leads to a more accurate environment model, and a more accurate model, in turn, enables exploring a larger zone. We propose the first equilibrium-oriented safe exploration framework called safe equilibrium exploration (SEE), which alternates between finding the maximum…
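
The interdependence described above can be illustrated with a toy fixed-point iteration. The sketch below is not the paper's SEE algorithm; it is a minimal illustration under assumptions that do not come from the source: a 1-D grid of 12 states, a hidden state-dependent drift that is 1-Lipschitz and bounded by 3, noise-free observations, and the ability to visit every state of the current zone. All names (`safe_action`, `max_feasible_zone`, `refine_model`, `safe_equilibrium_sketch`, `TRUE_DRIFT`) are hypothetical. The "model" is an interval of possible drifts per state, and the "feasible zone" is the largest set of states from which some action keeps every possible successor inside the set.

```python
ACTIONS = (-1, 0, 1)
N = 12      # states 0..11; stepping off the grid violates the constraint
BOUND = 3   # assumed prior bound on |drift|
# Hidden true drift (hypothetical): 1-Lipschitz in the state, |d| <= BOUND.
TRUE_DRIFT = [0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 3]


def safe_action(s, zone, lo, hi):
    """Return an action that keeps every possible successor in zone, else None."""
    for a in ACTIONS:
        if all(s + a + d in zone for d in range(lo[s], hi[s] + 1)):
            return a
    return None


def max_feasible_zone(lo, hi):
    """Largest robustly invariant set under the interval model [lo[s], hi[s]]."""
    zone = set(range(N))
    changed = True
    while changed:
        changed = False
        for s in sorted(zone):  # iterate over a copy so removal is safe
            if safe_action(s, zone, lo, hi) is None:
                zone.discard(s)
                changed = True
    return zone


def refine_model(zone, lo, hi):
    """Explore every zone state with a certified action and tighten the model."""
    new_lo, new_hi = list(lo), list(hi)
    for s in zone:
        a = safe_action(s, zone, lo, hi)
        s_next = s + a + TRUE_DRIFT[s]   # one environment step
        d_obs = s_next - s - a           # exact drift at s (noise-free toy)
        for t in range(N):               # Lipschitz propagation to other states
            new_lo[t] = max(new_lo[t], d_obs - abs(t - s))
            new_hi[t] = min(new_hi[t], d_obs + abs(t - s))
    return new_lo, new_hi


def safe_equilibrium_sketch(start=4):
    """Alternate zone computation and model refinement until nothing changes."""
    d0 = TRUE_DRIFT[start]  # assumption: the drift at the start state is known
    lo = [max(-BOUND, d0 - abs(s - start)) for s in range(N)]
    hi = [min(BOUND, d0 + abs(s - start)) for s in range(N)]
    history = []
    while True:
        zone = max_feasible_zone(lo, hi)
        history.append(sorted(zone))
        new_lo, new_hi = refine_model(zone, lo, hi)
        if (new_lo, new_hi) == (lo, hi):  # model unchanged, so the zone is too
            return zone, history
        lo, hi = new_lo, new_hi


zone, history = safe_equilibrium_sketch()
print([len(z) for z in history])  # [3, 5, 7, 9, 9]
print(sorted(zone))               # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

In this toy, each round of exploration tightens the drift intervals of states just outside the zone, which lets those states be certified in the next round. The iteration stops when exploring the zone no longer changes the model, which is the toy analogue of the equilibrium between zone and model.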

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Robot Manipulation and Learning