TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

Mengjiao Yang; Sergey Levine; Ofir Nachum

arXiv:2110.14770 · cs.LG · October 29, 2021 · 5 citations

TL;DR

TRAIL uses suboptimal offline data to learn a latent action space, which improves the sample efficiency and performance of downstream imitation learning.

Contribution

The paper proposes training objectives that use suboptimal offline data to learn a factored transition model and, from its structure, a latent action space in which downstream imitation learning is more sample-efficient.

Findings

01. TRAIL improves imitation learning performance by up to 4x.

02. Suboptimal offline data, although not useful for direct imitation, leads to better downstream policies when it is used to learn the latent action space.

03. Theoretical analysis confirms the sample-efficiency benefits of the learned latent action space.

Abstract

The aim in imitation learning is to learn effective policies by utilizing near-optimal expert demonstrations. However, high-quality demonstrations from human experts can be expensive to obtain in large numbers. On the other hand, it is often much easier to obtain large quantities of suboptimal or task-agnostic trajectories, which are not useful for direct imitation, but can nevertheless provide insight into the dynamical structure of the environment, showing what could be done in the environment even if not what should be done. We ask the question, is it possible to utilize such suboptimal offline datasets to facilitate provably improved downstream imitation learning? In this work, we answer this question affirmatively and present training objectives that use offline datasets to learn a factored transition model whose structure enables the extraction of a latent action space. Our…
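
As a rough illustration of the idea in the abstract, the toy sketch below fits a transition model on randomly collected transitions, factors it to obtain a low-dimensional latent action, and then runs behavioral cloning in that latent space on a handful of expert samples. Everything in it is an assumption made for illustration: the linear dynamics `B_true`, the SVD-based factorization, the least-squares fits, and all names (`encoder`, `dynamics`, `latent_bc_policy`, and so on) are hypothetical and are not the paper's training objectives or its released code.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, latent_dim = 3, 6, 2

# Hypothetical toy dynamics s' = s + B a, where B has rank `latent_dim`,
# so many raw actions are redundant. This is an assumption, not the paper's setup.
B_true = rng.normal(size=(state_dim, latent_dim)) @ rng.normal(size=(latent_dim, action_dim))

def step(s, a):
    """Batched toy transition: s has shape (N, state_dim), a has shape (N, action_dim)."""
    return s + a @ B_true.T

# 1) Suboptimal offline data: random states paired with random actions.
S = rng.normal(size=(2000, state_dim))
A = rng.normal(size=(2000, action_dim))
S_next = step(S, A)

# 2) Fit a linear transition model (s' - s) ~ a @ M, then factor M with an SVD
#    into an action encoder and latent dynamics (a stand-in for a factored model).
M, *_ = np.linalg.lstsq(A, S_next - S, rcond=None)      # shape (action_dim, state_dim)
U, sing, Vt = np.linalg.svd(M, full_matrices=False)
encoder = U[:, :latent_dim]                             # z = a @ encoder
dynamics = sing[:latent_dim, None] * Vt[:latent_dim]    # (s' - s) ~ z @ dynamics
decoder = encoder.T                                     # a = z @ decoder (orthonormal columns)

# 3) A small expert dataset. The "expert" here is a hypothetical linear controller
#    that pushes the state toward the origin as far as B allows.
def expert_policy(s):
    return -s @ np.linalg.pinv(B_true).T

S_exp = rng.normal(size=(5, state_dim))
A_exp = expert_policy(S_exp)

# 4) Behavioral cloning in the latent action space: regress z on s, then decode.
Z_exp = A_exp @ encoder
K, *_ = np.linalg.lstsq(S_exp, Z_exp, rcond=None)       # shape (state_dim, latent_dim)

def latent_bc_policy(s):
    return (s @ K) @ decoder

# Compare the cloned policy with the expert on held-out states.
S_test = rng.normal(size=(100, state_dim))
gap = np.abs(step(S_test, latent_bc_policy(S_test)) - step(S_test, expert_policy(S_test))).max()
print("max next-state gap between cloned and expert policy:", gap)
```

In this noise-free linear toy the cloned policy matches the expert's transitions up to numerical precision. The sketch is meant only to show the shape of the pipeline (offline data for the latent action space, expert data for the latent policy), not to reproduce the reported results.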

Code & Models

Repositories

google-research/google-research · tf · Official

Videos

TRAIL: Near-Optimal Imitation Learning with Suboptimal Data · SlidesLive

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics