Reward-free World Models for Online Imitation Learning

Shangzhe Li; Zhiao Huang; Hao Su

arXiv:2410.14081 · cs.LG · May 13, 2025

TL;DR

This paper introduces a reward-free world model approach for online imitation learning that models environment dynamics in latent space, improving stability and performance in complex high-dimensional tasks.

Contribution

The paper combines a latent-space dynamics model with the inverse soft-Q learning objective to make online imitation learning more stable in complex environments.

Findings

01

Achieves stable, expert-level performance in high-dimensional tasks

02

Outperforms existing methods on benchmarks like DMControl, MyoSuite, and ManiSkill2

03

Demonstrates the effectiveness of reward-free latent dynamics modeling

Abstract

Imitation learning (IL) enables agents to acquire skills directly from expert demonstrations, providing a compelling alternative to reinforcement learning. However, prior online IL approaches struggle with complex tasks characterized by high-dimensional inputs and complex dynamics. In this work, we propose a novel approach to online imitation learning that leverages reward-free world models. Our method learns environmental dynamics entirely in latent spaces without reconstruction, enabling efficient and accurate modeling. We adopt the inverse soft-Q learning objective, reformulating the optimization process in the Q-policy space to mitigate the instability associated with traditional optimization in the reward-policy space. By employing a learned latent dynamics model and planning for control, our approach consistently achieves stable, expert-level performance in tasks with…
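To make "learns environmental dynamics entirely in latent spaces without reconstruction" concrete, here is a minimal sketch of a latent consistency loss. It is an illustration only, not the authors' implementation (the official repository is in PyTorch). The linear-tanh encoder and dynamics, all function names, and the dimensions are hypothetical. The point is that the predicted next latent is compared with the encoding of the next observation, so no decoder or pixel-level reconstruction term appears, and no reward signal is used.

```python
import math
import random

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, ACT_DIM, LATENT_DIM = 6, 2, 3


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear_tanh(x, weights):
    """Compute tanh(x @ weights) for a single input vector x."""
    cols = len(weights[0])
    return [
        math.tanh(sum(x[i] * weights[i][j] for i in range(len(x))))
        for j in range(cols)
    ]


def encode(obs, w_enc):
    """Assumed encoder: maps an observation to a latent vector."""
    return linear_tanh(obs, w_enc)


def predict_next_latent(z, act, w_dyn):
    """Assumed latent dynamics: maps (latent, action) to the next latent."""
    return linear_tanh(z + act, w_dyn)  # list concatenation of z and act


def latent_consistency_loss(batch, w_enc, w_dyn):
    """Mean squared distance between predicted and encoded next latents.

    No observation is ever reconstructed. In a gradient-based training loop
    the target latent would normally be treated as a constant (stop-gradient).
    """
    total = 0.0
    for obs, act, next_obs in batch:
        z_pred = predict_next_latent(encode(obs, w_enc), act, w_dyn)
        z_target = encode(next_obs, w_enc)
        total += sum((p - t) ** 2 for p, t in zip(z_pred, z_target))
    return total / len(batch)


rng = random.Random(0)
w_enc = make_matrix(OBS_DIM, LATENT_DIM, rng)
w_dyn = make_matrix(LATENT_DIM + ACT_DIM, LATENT_DIM, rng)

# Random (obs, action, next_obs) transitions standing in for real data.
batch = [
    (
        [rng.uniform(-1, 1) for _ in range(OBS_DIM)],
        [rng.uniform(-1, 1) for _ in range(ACT_DIM)],
        [rng.uniform(-1, 1) for _ in range(OBS_DIM)],
    )
    for _ in range(4)
]

loss = latent_consistency_loss(batch, w_enc, w_dyn)
print(f"latent consistency loss: {loss:.6f}")
```

The abstract also describes pairing the world model with the inverse soft-Q objective and with planning for control. Both are omitted here to keep the sketch short.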

Code & Models

Repositories

tobyleelsz/iqmpc (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Advanced Vision and Imaging

Methods: Sparse Evolutionary Training