Efficient Imitation Learning with Conservative World Models
Victor Kolev, Rafael Rafailov, Kyle Hatch, Jiajun Wu, Chelsea Finn

TL;DR
This paper introduces a conservative world model approach for imitation learning from expert demonstrations, addressing distributional shift issues and achieving state-of-the-art results in high-dimensional manipulation tasks without reward labels.
Contribution
It reformulates imitation learning as a fine-tuning problem with a conservative optimization bound, improving stability and performance over prior model-based methods.
Findings
Achieved state-of-the-art performance on the Franka Kitchen environment.
Reduced sample complexity by using synthetic data from learned world models.
Demonstrated effectiveness on complex dexterous manipulation tasks.
Abstract
We tackle the problem of policy learning from expert demonstrations without a reward function. A central challenge in this space is that these policies fail upon deployment due to issues of distributional shift, environment stochasticity, or compounding errors. Adversarial imitation learning alleviates this issue but requires additional on-policy training samples for stability, which presents a challenge in realistic domains due to inefficient learning and high sample complexity. One approach to this issue is to learn a world model of the environment, and use synthetic data for policy training. While successful in prior works, we argue that this is sub-optimal due to additional distribution shifts between the learned model and the real environment. Instead, we re-frame imitation learning as a fine-tuning problem, rather than a pure reinforcement learning one. Drawing theoretical…
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation
Methods: Sparse Evolutionary Training
