Offline Imitation Learning upon Arbitrary Demonstrations by Pre-Training Dynamics Representations
Haitong Ma, Bo Dai, Zhaolin Ren, Yebin Wang, Na Li

TL;DR
This paper introduces a pre-training approach that learns dynamics representations from arbitrary data to improve offline imitation learning, enabling effective policy mimicry with minimal expert demonstrations.
Contribution
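The paper proposes a pre-training method that learns dynamics representations to improve offline IL when expert data is limited. It also gives a theoretical justification: the optimal decision variable of offline IL lies in the representation space, which reduces the number of parameters the downstream IL stage has to learn.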
Findings
Can mimic expert policies with as few as one trajectory.
Leverages pre-trained dynamics from simulator data for real-world tasks.
Improves offline IL performance with limited demonstrations.
Abstract
Limited data has become a major bottleneck in scaling up offline imitation learning (IL). In this paper, we propose enhancing IL performance under limited expert data by introducing a pre-training stage that learns dynamics representations, derived from factorizations of the transition dynamics. We first theoretically justify that the optimal decision variable of offline IL lies in the representation space, significantly reducing the parameters to learn in the downstream IL. Moreover, the dynamics representations can be learned from arbitrary data collected with the same dynamics, allowing the reuse of massive non-expert data and mitigating the limited-data issue. We present a tractable loss function inspired by noise contrastive estimation to learn the dynamics representations at the pre-training stage. Experiments on MuJoCo demonstrate that our proposed algorithm can mimic expert…
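To make the pre-training idea in the abstract concrete, here is a minimal sketch of a contrastive loss in the spirit of noise contrastive estimation. It is a generic InfoNCE-style objective, not the paper's exact loss, which this page does not show. The names `phi_sa`, `mu_next`, and `nce_dynamics_loss` are hypothetical. The sketch assumes each transition (s, a, s') has already been encoded into a d-dimensional feature for the state-action pair and one for the next state, and it treats the other next states in the batch as negatives.

```python
import numpy as np

def nce_dynamics_loss(phi_sa, mu_next):
    """InfoNCE-style loss over a batch of transitions (illustrative only).

    phi_sa:  (B, d) array, hypothetical features of (state, action) pairs
    mu_next: (B, d) array, hypothetical features of the observed next states
    Row i of both arrays comes from the same transition; the remaining
    rows of mu_next act as negative next states for row i.
    """
    # Score every (state, action) feature against every next-state feature.
    logits = phi_sa @ mu_next.T                                # shape (B, B)
    # Subtract the row-wise max before exponentiating for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    # Row-wise log-softmax: log-probability of each candidate next state.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The true next state for row i sits on the diagonal.
    return -np.mean(np.diag(log_probs))

# Toy check with hand-made features: matched pairs score high, mismatched low.
aligned_phi = 5.0 * np.eye(4)
aligned_mu = np.eye(4)
loss_aligned = nce_dynamics_loss(aligned_phi, aligned_mu)

# Shifting the next-state features by one row breaks every pairing.
loss_shuffled = nce_dynamics_loss(aligned_phi, np.roll(aligned_mu, 1, axis=0))

# With uninformative (all-zero) features the loss equals log(batch size).
loss_uniform = nce_dynamics_loss(np.zeros((4, 4)), np.zeros((4, 4)))
```

In this toy setting the loss is small when the two feature sets agree on which next state belongs to which state-action pair, and large when the pairing is broken. Because such a loss only needs transitions, not expert labels, it can be trained on any data collected under the same dynamics, which is the property the abstract relies on.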
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Human Pose and Action Recognition
