Curriculum Offline Imitation Learning

Minghuan Liu; Hanye Zhao; Zhengyu Yang; Jian Shen; Weinan Zhang; Li Zhao; Tie-Yan Liu

arXiv:2111.02056 · cs.LG · January 13, 2022 · 1 citation

TL;DR

This paper introduces COIL, a curriculum-based offline imitation learning method that adaptively selects better policies from datasets, outperforming traditional IL and competing with state-of-the-art offline RL methods.

Contribution

The paper proposes a novel curriculum offline imitation learning approach that improves policy quality by adaptively selecting neighboring policies, addressing IL limitations on mixed datasets.

Findings

01 COIL outperforms traditional IL on continuous control benchmarks.

02 COIL is competitive with state-of-the-art offline RL methods.

03 The experience picking strategy enhances policy learning from mixed datasets.

Abstract

Offline reinforcement learning (RL) tasks require the agent to learn from a pre-collected dataset with no further interaction with the environment. Despite their potential to surpass the behavioral policies, RL-based methods are generally impractical because of training instability and the bootstrapping of extrapolation errors, both of which call for careful hyperparameter tuning via online evaluation. In contrast, offline imitation learning (IL) has no such issues, since it learns the policy directly without estimating the value function by bootstrapping. However, IL is usually limited by the capability of the behavioral policy and tends to learn mediocre behavior from a dataset collected by a mixture of policies. In this paper, we aim to take advantage of IL while mitigating this drawback. Observing that behavior cloning is able to imitate neighboring policies with less data, we propose…
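
The abstract's two observations (plain IL settles on mediocre behavior when the data mixes several policies, while behavior cloning imitates a neighboring policy easily) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper or from the apexrl/coil repository: the one-parameter linear policy, the toy task that rewards a = s, the likelihood-based notion of "neighboring", the simple return floor, and all names (make_trajectory, log_likelihood, bc_steps, curriculum_bc).

```python
import numpy as np

STATES = np.linspace(-1.0, 1.0, 21)  # shared toy states (assumption)

def make_trajectory(gain):
    """Toy trajectory from a linear behavior policy a = gain * s.
    The hypothetical task rewards a = s, so gain = 1.0 is optimal."""
    actions = gain * STATES
    ret = -float(np.mean((actions - STATES) ** 2))
    return {"gain": gain, "s": STATES, "a": actions, "ret": ret}

def log_likelihood(theta, traj, sigma=0.5):
    """Mean log-likelihood of the trajectory's actions under a ~ N(theta * s, sigma^2)."""
    resid = traj["a"] - theta * traj["s"]
    return float(np.mean(-0.5 * (resid / sigma) ** 2) - np.log(sigma * np.sqrt(2 * np.pi)))

def bc_steps(theta, trajs, lr=1.0, n_steps=50):
    """Behavior cloning: gradient steps on squared error, starting from the current theta."""
    s = np.concatenate([t["s"] for t in trajs])
    a = np.concatenate([t["a"] for t in trajs])
    for _ in range(n_steps):
        theta -= lr * float(np.mean((theta * s - a) * s))
    return theta

def curriculum_bc(dataset, theta=0.0, n_pick=1, max_rounds=20):
    """Repeatedly pick the trajectories most likely under the current policy,
    clone them, and discard data whose return falls below what was already imitated."""
    pool, floor, picked_gains = list(dataset), -np.inf, []
    for _ in range(max_rounds):
        # Return floor (assumption of this sketch): drop data worse than past picks.
        pool = [t for t in pool if t["ret"] >= floor]
        if not pool:
            break
        # "Neighboring" data = highest likelihood under the current policy.
        pool.sort(key=lambda t: log_likelihood(theta, t), reverse=True)
        picked, pool = pool[:n_pick], pool[n_pick:]
        theta = bc_steps(theta, picked)
        floor = max(floor, float(np.mean([t["ret"] for t in picked])))
        picked_gains += [t["gain"] for t in picked]
    return theta, picked_gains

# Mixed dataset: one poor policy, three intermediate ones, one optimal.
dataset = [make_trajectory(g) for g in (-0.5, 0.2, 0.5, 0.8, 1.0)]
plain_theta = bc_steps(0.0, dataset)            # plain BC on the whole mixture
coil_theta, order = curriculum_bc(dataset)      # curriculum over neighboring data
print(plain_theta, coil_theta, order)
```

Under these toy assumptions, plain behavior cloning on the mixture converges to roughly the average gain (about 0.4), whereas the curriculum climbs through the gains 0.2, 0.5, 0.8 and ends near the optimal 1.0; the poor policy (gain -0.5) is discarded by the return floor before it is ever picked.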

Code & Models

Repositories

apexrl/coil (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Evolutionary Algorithms and Applications