Offline Imitation Learning with Model-based Reverse Augmentation

Jie-Jing Shao; Hao-Sen Shi; Lan-Zhe Guo; Yu-Feng Li

arXiv:2406.12550·cs.LG·June 19, 2024

TL;DR

This paper introduces a novel model-based offline imitation learning framework called SRA that uses reverse dynamic models to generate trajectories, improving exploration of unobserved states and achieving state-of-the-art results.

Contribution

The paper proposes a self-paced reverse augmentation method that enhances offline imitation learning by exploring unobserved states with a reverse dynamic model.
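As a rough illustration of reverse augmentation, the sketch below rolls trajectories backward from expert-observed states, so that every generated trajectory, read forward in time, ends in a state the expert has seen. It is a toy under stated assumptions, not the paper's implementation: the 1-D dynamics `s_next = s + a` are assumed known (so the reverse model is simply `s = s_next - a`, whereas in practice it would be learned from data), the reverse policy is uniform random, and the names `reverse_rollout` and `augment` are hypothetical. The self-paced component is not modeled here.

```python
import numpy as np


def reverse_rollout(expert_state, horizon, rng, max_step=0.5):
    """Generate one trajectory backward from an expert-observed state.

    Toy assumption: known 1-D dynamics s_next = s + a, so the reverse
    model is s = s_next - a. A learned reverse model would replace this.
    """
    transitions = []
    s_next = float(expert_state)
    for _ in range(horizon):
        # Hypothetical reverse policy: sample the action that led to s_next.
        a = rng.uniform(-max_step, max_step)
        s = s_next - a  # reverse dynamics: predecessor state
        transitions.append((s, a, s_next))
        s_next = s
    # Reorder so the trajectory reads forward in time and ends at the expert state.
    transitions.reverse()
    return transitions


def augment(expert_states, horizon=3, rollouts_per_state=2, seed=0):
    """Collect reverse rollouts that all terminate in expert-observed states."""
    rng = np.random.default_rng(seed)
    data = []
    for s_e in expert_states:
        for _ in range(rollouts_per_state):
            data.append(reverse_rollout(s_e, horizon, rng))
    return data


expert_states = [0.0, 1.0]
augmented = augment(expert_states)
```

Each element of `augmented` is a list of `(state, action, next_state)` tuples. The start states lie outside the expert data, but the actions stored with them lead back into it, which is the kind of supervision a forward rollout from expert states cannot provide.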

Findings

01. Effectively mitigates covariate shift in offline IL

02. Achieves state-of-the-art performance on benchmarks

03. Enables generalization beyond expert data

Abstract

In offline Imitation Learning (IL), one of the main challenges is the covariate shift between the expert observations and the actual distribution encountered by the agent, because it is difficult to determine what action an agent should take when outside the state distribution of the expert demonstrations. Recent model-free solutions introduce supplementary data and identify latent expert-similar samples to augment the reliable samples during learning. Model-based solutions build forward dynamic models with conservatism quantification and then generate additional trajectories in the neighborhood of expert demonstrations. However, without reward supervision, these methods are often over-conservative in out-of-expert-support regions, because only in states close to expert-observed states can there be a preferred action enabling policy optimization. To encourage…

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning