Structuring Value Representations via Geometric Coherence in Markov Decision Processes
Zuyuan Zhang, Zeyu Fang, Tian Lan

TL;DR
This paper introduces GCR-RL, a novel reinforcement learning framework that leverages geometric coherence and order theory to improve stability and sample efficiency in value function estimation.
Contribution
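It recasts value function estimation as learning a poset (partially ordered set) and develops Q-learning and actor-critic algorithms that compute super-poset refinements, keeping successive posets geometrically coherent during RL training.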
Findings
GCR-RL improves sample efficiency significantly.
It achieves more stable performance than the baselines.
Theoretical analysis confirms convergence properties.
Abstract
Geometric properties can be leveraged to stabilize and speed up reinforcement learning. Existing examples include encoding symmetry structure, geometry-aware data augmentation, and enforcing structural restrictions. In this paper, we take a novel view of RL through the lens of order theory and recast value function estimation as learning a desired poset (partially ordered set). We propose GCR-RL (Geometric Coherence Regularized Reinforcement Learning), which computes a sequence of super-poset refinements by refining the posets from previous steps and learning additional order relationships from temporal-difference signals, thus ensuring geometric coherence across the sequence of posets underpinning the learned value functions. Two novel algorithms, based on Q-learning and actor-critic, are developed to efficiently realize these super-poset refinements. Their theoretical properties and…
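To make the idea of a super-poset refinement concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a strict partial order stored as a set of pairs `(a, b)`, read as "a ranks below b", and shows one refinement step that keeps every existing relation while adding new ones that do not contradict them. The names `transitive_closure`, `refine`, and `candidates_from_values` are hypothetical, and the margin rule on value estimates is only a stand-in for the temporal-difference signals mentioned in the abstract.

```python
from itertools import product


def transitive_closure(relations):
    """Return the transitive closure of a set of strict pairs (a, b), read as a < b."""
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def refine(poset, candidates):
    """Add candidate pairs that keep the relation a strict partial order.

    Every pair already in `poset` is kept, so the result is a super-poset
    of the input (assumption: `poset` is itself a valid strict order).
    """
    refined = transitive_closure(poset)
    for a, b in candidates:
        if a == b or (b, a) in refined:
            continue  # would break irreflexivity or antisymmetry, so skip it
        refined = transitive_closure(refined | {(a, b)})
    return refined


def candidates_from_values(values, margin):
    """Hypothetical rule: propose a < b when value estimates differ by more than `margin`."""
    return [(a, b) for a in values for b in values if values[b] - values[a] > margin]


# Illustrative data only: three states with made-up value estimates.
poset_0 = {("s0", "s1")}
values = {"s0": 0.1, "s1": 0.5, "s2": 0.9}
poset_1 = refine(poset_0, candidates_from_values(values, margin=0.3))
```

In this sketch a candidate that contradicts an earlier relation is dropped rather than allowed to overwrite it, which is one simple way to keep each poset in the sequence consistent with the ones before it.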
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Graph Neural Networks
