Efficient Imitation Learning with Conservative World Models

Victor Kolev; Rafael Rafailov; Kyle Hatch; Jiajun Wu; Chelsea Finn

arXiv:2405.13193·cs.LG·August 19, 2024

TL;DR

This paper introduces a conservative world model approach for imitation learning from expert demonstrations, addressing distributional shift issues and achieving state-of-the-art results in high-dimensional manipulation tasks without reward labels.

Contribution

The paper reformulates imitation learning as a fine-tuning problem with a conservative optimization bound, improving stability and performance over prior model-based methods.

Findings

01. Achieved state-of-the-art performance on the Franka Kitchen environment.

02. Reduced sample complexity by training on synthetic data from learned world models.

03. Demonstrated effectiveness on complex dexterous manipulation tasks.

Abstract

We tackle the problem of policy learning from expert demonstrations without a reward function. A central challenge in this space is that these policies fail upon deployment due to issues of distributional shift, environment stochasticity, or compounding errors. Adversarial imitation learning alleviates this issue but requires additional on-policy training samples for stability, which presents a challenge in realistic domains due to inefficient learning and high sample complexity. One approach to this issue is to learn a world model of the environment, and use synthetic data for policy training. While successful in prior works, we argue that this is sub-optimal due to additional distribution shifts between the learned model and the real environment. Instead, we re-frame imitation learning as a fine-tuning problem, rather than a pure reinforcement learning one. Drawing theoretical…
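The world-model idea mentioned in the abstract (fit a model of the environment, then train the policy on synthetic rollouts) can be illustrated with a toy sketch. This is not the paper's method: the scalar linear dynamics, the least-squares fit, and every name below (`true_step`, `collect_transitions`, `fit_linear_model`, `rollout_in_model`) are assumptions made purely for illustration.

```python
import random


def true_step(s, a):
    # Hypothetical ground-truth dynamics; the learner never sees these constants.
    return 0.9 * s + 0.5 * a


def collect_transitions(n, seed=0):
    # Gather (state, action, next_state) tuples from the real environment
    # using random actions.
    rng = random.Random(seed)
    data, s = [], 1.0
    for _ in range(n):
        a = rng.uniform(-1.0, 1.0)
        s_next = true_step(s, a)
        data.append((s, a, s_next))
        s = s_next
    return data


def fit_linear_model(data):
    # Least-squares fit of s_next ~ w_s * s + w_a * a,
    # solved with the 2x2 normal equations (Cramer's rule).
    sss = sum(s * s for s, a, n in data)
    ssa = sum(s * a for s, a, n in data)
    saa = sum(a * a for s, a, n in data)
    ssn = sum(s * n for s, a, n in data)
    san = sum(a * n for s, a, n in data)
    det = sss * saa - ssa * ssa
    w_s = (ssn * saa - ssa * san) / det
    w_a = (sss * san - ssa * ssn) / det
    return w_s, w_a


def rollout_in_model(model, policy, s0, horizon):
    # Generate synthetic transitions by stepping the learned model,
    # with no further calls to the real environment.
    w_s, w_a = model
    s, traj = s0, []
    for _ in range(horizon):
        a = policy(s)
        s_next = w_s * s + w_a * a
        traj.append((s, a, s_next))
        s = s_next
    return traj


model = fit_linear_model(collect_transitions(200))
synthetic = rollout_in_model(model, lambda s: -0.5 * s, s0=1.0, horizon=10)
```

In this noise-free toy the fitted model matches the true dynamics almost exactly. With a real, imperfect model, errors accumulate along the rollout, which is the extra model-versus-environment distribution shift the abstract points to.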

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation

Methods: Sparse Evolutionary Training