On the Geometry of Reinforcement Learning in Continuous State and Action Spaces
Saket Tiwari, Omer Gottesman, George Konidaris

TL;DR
This paper introduces a geometric perspective on reinforcement learning in continuous state and action spaces. It proves that, under certain conditions, the dimension of the manifold of reachable states is at most the dimension of the action space plus one, and it supports this bound with empirical and algorithmic results.
Contribution
It provides the first theoretical link between state space geometry and action space dimensionality in continuous reinforcement learning, supported by empirical validation and a new learning algorithm.
Findings
Under certain conditions, the dimension of the reachable state manifold is at most the dimension of the action space plus one.
Empirical validation in four MuJoCo environments supports the theoretical bound.
A new algorithm effectively learns low-dimensional policies with comparable or better performance.
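The bound in the first finding can be written compactly. The symbols below are notation assumed here for illustration and are not taken from the paper: \mathcal{M} is the manifold of reachable states, d_s is the dimension of the nominal state space, and d_a is the dimension of the action space. The conditions under which the bound holds are not spelled out in this summary.

```latex
% Assumed notation: \mathcal{M} = reachable-state manifold,
% d_s = nominal state dimension, d_a = action dimension.
\[
  \mathcal{M} \subset \mathbb{R}^{d_s},
  \qquad
  \dim(\mathcal{M}) \;\le\; d_a + 1 .
\]
```

The bound is informative when d_a + 1 is much smaller than d_s, which is the usual situation when a few actuators drive a system described by many state variables.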
Abstract
Advances in reinforcement learning have led to its successful application in complex tasks with continuous state and action spaces. Despite these advances in practice, most theoretical work pertains to finite state and action spaces. We propose building a theoretical understanding of continuous state and action spaces by employing a geometric lens. Central to our work is the idea that the transition dynamics induce a low dimensional manifold of reachable states embedded in the high-dimensional nominal state space. We prove that, under certain conditions, the dimensionality of this manifold is at most the dimensionality of the action space plus one. This is the first result of its kind, linking the geometry of the state space to the dimensionality of the action space. We empirically corroborate this upper bound for four MuJoCo environments. We further demonstrate the applicability of our…
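To make the idea of a low-dimensional set embedded in a high-dimensional nominal state space concrete, here is a minimal toy sketch. It is not the paper's estimation method and does not use MuJoCo: it assumes a purely linear embedding, hypothetical dimensions (`action_dim = 2`, `nominal_dim = 17`), and a hypothetical helper `numerical_rank` that counts the significant singular values of the centered data.

```python
import numpy as np


def numerical_rank(points, rel_tol=1e-8):
    """Count singular values of the centered data above rel_tol * largest.

    For points lying on a linear subspace this equals the subspace dimension.
    Hypothetical helper for illustration only.
    """
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(singular_values > rel_tol * singular_values[0]))


rng = np.random.default_rng(0)

action_dim = 2                # assumed action-space dimension
nominal_dim = 17              # assumed nominal state-space dimension
latent_dim = action_dim + 1   # dimension permitted by the stated bound

# Sample latent coordinates and embed them linearly in the nominal space.
latent = rng.standard_normal((500, latent_dim))
embedding = rng.standard_normal((latent_dim, nominal_dim))
states = latent @ embedding   # shape (500, 17), but only 3 degrees of freedom

estimated_dim = numerical_rank(states)
print(estimated_dim, "<=", action_dim + 1)
```

A real reachable-state manifold is generally curved, so a global rank computation like this one would overestimate its dimension; a local or nonlinear intrinsic-dimension estimator would be needed in that setting.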
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
Methods: Adam · Convolution · Dense Connections · Experience Replay · Weight Decay · Batch Normalization · Deep Deterministic Policy Gradient
