Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL
Peng Cheng, Xianyuan Zhan, Zhihao Wu, Wenjia Zhang, Shoucheng Song, Han Wang, Youfang Lin, Li Jiang

TL;DR
This paper introduces TSRL, an offline RL algorithm that exploits time-reversal symmetry in system dynamics to improve data efficiency and performance on small datasets, where it outperforms recent offline RL algorithms.
Contribution
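The paper proposes TDM, a time-reversal symmetry (T-symmetry) enforced dynamics model, and TSRL, an offline RL algorithm. Both use the fundamental symmetry of system dynamics to improve offline RL, especially with limited data. The paper also introduces a new reliability measure for out-of-distribution samples.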
Findings
TSRL performs well with as little as 1% of the original data.
Leveraging symmetry improves both the learned representations and out-of-distribution detection.
TSRL outperforms recent offline RL algorithms in small-data regimes.
Abstract
Offline reinforcement learning (RL) offers an appealing approach to real-world tasks by learning policies from pre-collected datasets without interacting with the environment. However, the performance of existing offline RL algorithms heavily depends on the scale and state-action space coverage of datasets. Real-world data collection is often expensive and uncontrollable, leading to small and narrowly covered datasets and posing significant challenges for practical deployments of offline RL. In this paper, we provide a new insight that leveraging the fundamental symmetry of system dynamics can substantially enhance offline RL performance under small datasets. Specifically, we propose a Time-reversal symmetry (T-symmetry) enforced Dynamics Model (TDM), which establishes consistency between a pair of forward and reverse latent dynamics. TDM provides both well-behaved representations for…
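The consistency between forward and reverse latent dynamics described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation. It assumes a hypothetical linear latent system, a forward model `f(z, a)` that predicts the latent change `z_next - z`, and a reverse model `g(z_next, a)` that predicts the opposite change `z - z_next`, so that a consistent pair satisfies `f(z, a) + g(z_next, a) = 0`. All names (`t_symmetry_residual`, `forward_model`, `reverse_model`, `A`, `B`) are invented for illustration.

```python
import numpy as np

def t_symmetry_residual(f, g, z, z_next, a):
    """Per-sample squared mismatch between forward and reverse latent dynamics.

    f(z, a)      predicts the forward latent change  z_next - z
    g(z_next, a) predicts the reverse latent change  z - z_next
    A consistent pair gives f(z, a) + g(z_next, a) == 0 for every sample.
    """
    return np.sum((f(z, a) + g(z_next, a)) ** 2, axis=-1)

# Hypothetical linear latent system (an assumption, not from the paper):
#   z_next = z + A z + B a
rng = np.random.default_rng(0)
A = 0.1 * rng.standard_normal((3, 3))
B = 0.1 * rng.standard_normal((3, 2))
M = np.linalg.inv(np.eye(3) + A)  # used to recover z from z_next

def forward_model(z, a):
    # Forward latent change: A z + B a (batched, rows are samples).
    return z @ A.T + a @ B.T

def reverse_model(z_next, a):
    # Exact reverse change: recover z, then negate the forward change.
    z = (z_next - a @ B.T) @ M.T
    return -(z @ A.T + a @ B.T)

def mismatched_reverse_model(z_next, a):
    # Deliberately inconsistent reverse model, for comparison only.
    return np.zeros_like(z_next)

z = rng.standard_normal((5, 3))
a = rng.standard_normal((5, 2))
z_next = z + forward_model(z, a)

consistent = t_symmetry_residual(forward_model, reverse_model, z, z_next, a)
mismatched = t_symmetry_residual(forward_model, mismatched_reverse_model, z, z_next, a)
print("max residual, consistent pair:", consistent.max())
print("mean residual, mismatched pair:", mismatched.mean())
```

In a learned model, the mean of such a residual could be added to the training loss as a regularizer. A per-sample residual of this kind could also plausibly serve as a reliability score in the spirit of the out-of-distribution measure mentioned under Contribution, though this summary does not specify the exact form the paper uses.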
Taxonomy
Topics: Neural Networks and Reservoir Computing · Reinforcement Learning in Robotics · Model Reduction and Neural Networks
