Equivariant Offline Reinforcement Learning
Arsh Tangri, Ondrej Biza, Dian Wang, David Klee, Owen Howell, Robert Platt

TL;DR
This paper explores the use of $SO(2)$-equivariant neural networks in offline reinforcement learning for robotic manipulation, demonstrating improved performance in low-data scenarios by leveraging symmetry properties.
Contribution
It introduces the integration of $SO(2)$-equivariant neural networks into offline RL algorithms, showing their advantage over non-equivariant methods in data-limited robotic tasks.
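To make the symmetry property concrete, here is a minimal sketch of what rotation equivariance of a policy and rotation invariance of a Q-function mean in the plane. It is not the architecture used in the paper: it swaps in plain group averaging over a finite rotation group $C_8$ as a stand-in for $SO(2)$, and the toy networks, the names `symmetrize_policy` and `symmetrize_q`, and the 2D state and action vectors are all assumptions made for illustration.

```python
import numpy as np

def rot(theta):
    """2x2 rotation matrix for angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def make_group(n):
    """The n rotations of the cyclic group C_n (a finite stand-in for SO(2))."""
    return [rot(2 * np.pi * k / n) for k in range(n)]

def symmetrize_policy(f, group):
    """Group-average f so that pi(g s) = g pi(s) for every g in the group."""
    def pi(s):
        # g.T is the inverse of a rotation matrix g.
        return np.mean([g.T @ f(g @ s) for g in group], axis=0)
    return pi

def symmetrize_q(q, group):
    """Group-average q so that Q(g s, g a) = Q(s, a) for every g in the group."""
    def q_inv(s, a):
        return float(np.mean([q(g @ s, g @ a) for g in group]))
    return q_inv

# Hypothetical toy networks with random weights (illustration only).
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 2)), rng.normal(size=(2, 16))
V, v = rng.normal(size=(16, 4)), rng.normal(size=16)

def raw_policy(s):
    return W2 @ np.tanh(W1 @ s)

def raw_q(s, a):
    return float(v @ np.tanh(V @ np.concatenate([s, a])))

group = make_group(8)
pi = symmetrize_policy(raw_policy, group)
q = symmetrize_q(raw_q, group)

s, a = rng.normal(size=2), rng.normal(size=2)
g = group[3]  # rotation by 135 degrees

# Both gaps should be zero up to floating-point error.
policy_gap = float(np.max(np.abs(pi(g @ s) - g @ pi(s))))
q_gap = abs(q(g @ s, g @ a) - q(s, a))
```

The intuition behind the low-data benefit is that a symmetric model treats every rotated copy of a transition consistently, so one demonstration constrains the model at all rotations in the group rather than at a single pose.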
Findings
Equivariant CQL and IQL outperform non-equivariant versions.
Equivariance enhances offline RL in low-data regimes.
Empirical results validate the benefit of symmetry-aware networks.
Abstract
Sample efficiency is critical when applying learning-based methods to robotic manipulation due to the high cost of collecting expert demonstrations and the challenges of on-robot policy learning through online Reinforcement Learning (RL). Offline RL addresses this issue by enabling policy learning from an offline dataset collected using any behavioral policy, regardless of its quality. However, recent advancements in offline RL have predominantly focused on learning from large datasets. Given that many robotic manipulation tasks can be formulated as rotation-symmetric problems, we investigate the use of $SO(2)$-equivariant neural networks for offline RL with a limited number of demonstrations. Our experimental results show that equivariant versions of Conservative Q-Learning (CQL) and Implicit Q-Learning (IQL) outperform their non-equivariant counterparts. We provide empirical evidence…
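For readers unfamiliar with the two base algorithms named above, the sketch below shows the characteristic loss term of each in isolation: the asymmetric expectile loss that IQL uses to fit its value function, and the conservative penalty that CQL adds to the usual Bellman error. It is a simplified illustration, not the paper's implementation: the CQL term is written in its discrete-action form, and the function names, array shapes, and the value `tau=0.7` are assumptions chosen for the example.

```python
import numpy as np

def expectile_loss(diff, tau=0.7):
    """IQL-style expectile loss on diff = Q(s, a) - V(s).

    Positive errors are weighted by tau and negative errors by 1 - tau,
    so tau > 0.5 pushes V(s) toward the upper range of Q over dataset actions.
    """
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))

def cql_penalty(q_values, data_actions):
    """CQL-style regularizer for discrete actions.

    q_values: array of shape (batch, n_actions) with Q(s, a) for every action.
    data_actions: integer array of shape (batch,) with the dataset actions.
    Returns mean(logsumexp_a Q(s, a) - Q(s, a_data)), which is non-negative
    and shrinks as Q on dataset actions dominates Q on all other actions.
    """
    m = q_values.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    lse = m[:, 0] + np.log(np.exp(q_values - m).sum(axis=1))
    q_data = q_values[np.arange(len(data_actions)), data_actions]
    return float(np.mean(lse - q_data))

# Small usage example with made-up numbers.
iql_term = expectile_loss(np.array([1.0, -1.0]), tau=0.7)
cql_term = cql_penalty(np.zeros((2, 4)), np.array([0, 3]))
```

In an equivariant variant, these loss terms would stay the same and only the networks producing `Q` and `V` would be replaced by symmetry-constrained ones.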
Taxonomy
Topics: Elevator Systems and Control · Adaptive Dynamic Programming Control
Methods: Q-Learning
