Operator-Guided Invariance Learning for Continuous Reinforcement Learning
Zuyuan Zhang, Fei Xu Yu, Tian Lan

TL;DR
This paper introduces VPSD-RL, a method that discovers value-preserving structures in continuous RL using Lie-group actions, enhancing data efficiency and robustness.
Contribution
The paper proposes a novel framework that models continuous RL with Lie-group-based value-preserving mappings, enabling discovery of both exact and approximate invariances.
Findings
Improved data efficiency on continuous-control benchmarks
Enhanced robustness under nuisance variability and shifts
Theoretical guarantees for stability with approximate structures
Abstract
Reinforcement learning (RL) with continuous time and state/action spaces is often data-intensive and brittle under nuisance variability and shift, motivating methods that exploit value-preserving structures to stabilize and improve learning. Most existing approaches focus on special cases, such as prescribed symmetries and exact equivariance, without addressing how to discover more general structures that require nonlinear operators to transform and map between continuous state/action systems with isomorphic value functions. We propose VPSD-RL (Value-Preserving Structure Discovery for Reinforcement Learning). It models continuous RL as a controlled diffusion with value-preserving mappings defined through Lie-group actions and associated pullback operators. We show that a value-preserving structure exists exactly when pulling back the value function and pushing forward actions…
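To make the idea of a pullback under a Lie-group action concrete, the following is a minimal illustrative sketch and not the paper's method. It assumes a toy setting that does not appear in the abstract: the rotation group SO(2) acting on a 2D state, and a hand-written value function. All names (`rotation`, `value`, `pullback`, `max_invariance_gap`) are hypothetical. It only checks whether a value function is unchanged when pulled back by group elements; it does not model actions, dynamics, or the controlled diffusion.

```python
import numpy as np

def rotation(theta):
    """Element of SO(2), a one-parameter Lie group, as a 2x2 matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def value(x):
    """Toy value function (an assumption): negative squared distance to the origin."""
    return -float(x @ x)

def skewed_value(x):
    """Toy value function without rotational symmetry, for contrast."""
    return -float(x[0] ** 2 + 3.0 * x[1] ** 2)

def pullback(v, g):
    """Pullback of v by the group element g: (g^* v)(x) = v(g x)."""
    return lambda x: v(g @ x)

def max_invariance_gap(v, thetas, states):
    """Largest |v(g x) - v(x)| over the sampled group elements and states."""
    return max(
        abs(pullback(v, rotation(t))(x) - v(x))
        for t in thetas
        for x in states
    )

rng = np.random.default_rng(0)
states = rng.normal(size=(32, 2))          # sampled 2D states
thetas = np.linspace(0.0, 2 * np.pi, 16)   # sampled rotation angles

exact_gap = max_invariance_gap(value, thetas, states)
skewed_gap = max_invariance_gap(skewed_value, thetas, states)
print(f"rotation-symmetric value gap: {exact_gap:.2e}")
print(f"skewed value gap:             {skewed_gap:.2e}")
```

In this toy case the first gap is zero up to floating-point error, because rotations preserve the norm, while the second is clearly nonzero. A gap that is small but not zero would correspond, loosely, to an approximate rather than exact invariance.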
