Geometry of Drifting MDPs with Path-Integral Stability Certificates
Zuyuan Zhang, Mahdi Imani, Tian Lan

TL;DR
This paper introduces a geometric framework for analyzing nonstationary reinforcement learning environments by modeling the environment as a differentiable path, leading to stability certificates and adaptive algorithms that improve tracking in dynamic settings.
Contribution
The paper takes a geometric perspective on nonstationary MDPs, derives path-integral stability bounds and gap-safe feasible regions, and develops the HT-RL and HT-MCTS algorithms, which adapt to changes in the environment.
Findings
Improved tracking and regret in oscillatory regimes
Effective online estimation of environment complexity
Enhanced stability guarantees for nonstationary RL
Abstract
Real-world reinforcement learning is often nonstationary: rewards and dynamics drift, accelerate, oscillate, and trigger abrupt switches in the optimal action. Existing theory often represents nonstationarity with coarse-scale models that measure how much the environment changes, not how it changes locally, even though acceleration and near-ties drive tracking error and policy chattering. We take a geometric view of nonstationary discounted Markov Decision Processes (MDPs) by modeling the environment as a differentiable homotopy path and tracking the induced motion of the optimal Bellman fixed point. This yields a length-curvature-kink signature of intrinsic complexity: cumulative drift, acceleration/oscillation, and action-gap-induced nonsmoothness. We prove a solver-agnostic path-integral stability bound and derive gap-safe feasible regions that certify local…
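The idea of tracking the optimal Bellman fixed point along a path of environments can be illustrated numerically. The sketch below is not the paper's method or its definitions: it assumes a simple linear interpolation between two tabular MDPs as the homotopy, solves each intermediate MDP by plain value iteration, and uses finite differences of Q* as rough proxies for length (cumulative drift), bending (acceleration), and kinks (greedy-action switches and the smallest action gap). The names `q_star` and `path_signature` are hypothetical.

```python
import numpy as np

def q_star(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Optimal Q-function of a discounted MDP via value iteration.
    P: (S, A, S) transition tensor, R: (S, A) float reward matrix."""
    Q = np.zeros_like(R, dtype=float)
    for _ in range(max_iter):
        # Bellman optimality backup: R + gamma * E[max_a' Q(s', a')]
        Q_new = R + gamma * (P @ Q.max(axis=1))
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new
        Q = Q_new
    return Q

def path_signature(P0, R0, P1, R1, gamma=0.9, n_steps=50):
    """Discrete proxies for a length / bending / kink signature.
    Assumption: the environment path is the linear interpolation
    (1 - t) * MDP0 + t * MDP1 for t in [0, 1]."""
    thetas = np.linspace(0.0, 1.0, n_steps + 1)
    Qs = np.stack([
        q_star((1 - t) * P0 + t * P1, (1 - t) * R0 + t * R1, gamma)
        for t in thetas
    ])                                            # shape (n_steps + 1, S, A)
    d1 = np.diff(Qs, axis=0)                      # first differences of Q*
    d2 = np.diff(Qs, n=2, axis=0)                 # second differences of Q*
    greedy = Qs.argmax(axis=2)                    # greedy action per state
    top2 = np.sort(Qs, axis=2)[:, :, -2:]         # two largest Q-values
    return {
        "length": float(np.abs(d1).max(axis=(1, 2)).sum()),
        "bending": float(np.abs(d2).max(axis=(1, 2)).sum()),
        "switches": int((greedy[1:] != greedy[:-1]).sum()),
        "min_gap": float((top2[..., 1] - top2[..., 0]).min()),
    }

# Toy example: fixed uniform transitions, rewards that swap between actions.
P = np.full((2, 2, 2), 0.5)
R_start = np.array([[1.0, 0.0], [1.0, 0.0]])
R_end = np.array([[0.0, 1.0], [0.0, 1.0]])
print(path_signature(P, R_start, P, R_end))
```

In the toy example the optimal action flips midway along the path, so the action gap closes to roughly zero there. That is the kind of near-tie the abstract associates with nonsmoothness and policy chattering.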
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Autonomous Vehicle Technology and Safety
