Discriminator-Guided Model-Based Offline Imitation Learning

Wenjia Zhang; Haoran Xu; Haoyi Niu; Peng Cheng; Ming Li; Heming Zhang; Guyue Zhou; Xianyuan Zhan

arXiv:2207.00244·cs.LG·January 11, 2023·1 cites

TL;DR

This paper introduces DMIL, a discriminator-guided, model-based framework for offline imitation learning. A discriminator judges both the dynamics correctness and the suboptimality of model rollout data against real expert demonstrations, which the authors report improves robustness and performance, particularly when expert data is limited.

Contribution

The paper proposes a novel discriminator-guided approach that couples policy and dynamics model learning, improving robustness and handling suboptimal demonstrations in offline IL.

Findings

01. DMIL outperforms state-of-the-art methods on small datasets.

02. The discriminator effectively distinguishes correct from suboptimal rollouts.

03. The framework is extendable to suboptimal demonstration scenarios.

Abstract

Offline imitation learning (IL) is a powerful method to solve decision-making problems from expert demonstrations without reward labels. Existing offline IL methods suffer from severe performance degeneration under limited expert data. Including a learned dynamics model can potentially improve the state-action space coverage of expert data; however, it also faces challenging issues such as model approximation/generalization errors and suboptimality of rollout data. In this paper, we propose the Discriminator-guided Model-based offline Imitation Learning (DMIL) framework, which introduces a discriminator to simultaneously distinguish the dynamics correctness and suboptimality of model rollout data against real expert demonstrations. DMIL adopts a novel cooperative-yet-adversarial learning strategy, which uses the discriminator to guide and couple the learning process of the policy and…
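The idea of a discriminator that separates model rollout data from real expert transitions can be illustrated with a toy sketch. This is not the DMIL architecture or loss, which the abstract does not specify. It assumes a hypothetical 1-D environment with true dynamics s' = s + a, a learned model whose predictions carry a constant bias, and a plain logistic-regression discriminator over (s, a, s'). All function names and parameters below are made up for illustration.

```python
import math
import random


def make_transitions(n, model_bias, rng):
    """Toy 1-D transitions (s, a, s'); true dynamics are s' = s + a.

    model_bias = 0.0 mimics real expert data; a non-zero bias mimics
    rollouts from an inaccurate learned dynamics model (assumption).
    """
    data = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        a = rng.uniform(-1.0, 1.0)
        s_next = s + a + model_bias + rng.gauss(0.0, 0.05)
        data.append((s, a, s_next))
    return data


def discriminator(w, x):
    """Estimated probability that transition x is a real expert transition."""
    z = w[0] * x[0] + w[1] * x[1] + w[2] * x[2] + w[3]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_discriminator(expert, rollout, steps=1000, lr=0.5):
    """Full-batch gradient descent on the binary cross-entropy loss."""
    data = [(x, 1.0) for x in expert] + [(x, 0.0) for x in rollout]
    w = [0.0, 0.0, 0.0, 0.0]  # three feature weights plus a bias term
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0, 0.0]
        for x, y in data:
            err = discriminator(w, x) - y
            for i in range(3):
                grad[i] += err * x[i]
            grad[3] += err
        w = [wi - lr * gi / len(data) for wi, gi in zip(w, grad)]
    return w


rng = random.Random(0)
expert = make_transitions(100, model_bias=0.0, rng=rng)
rollout = make_transitions(100, model_bias=1.0, rng=rng)
w = train_discriminator(expert, rollout)

expert_score = sum(discriminator(w, x) for x in expert) / len(expert)
rollout_score = sum(discriminator(w, x) for x in rollout) / len(rollout)
correct = sum(discriminator(w, x) > 0.5 for x in expert)
correct += sum(discriminator(w, x) <= 0.5 for x in rollout)
accuracy = correct / (len(expert) + len(rollout))
print(expert_score, rollout_score, accuracy)
```

In this toy setting the discriminator's output could serve as a per-transition confidence weight on rollout data. How DMIL actually uses the discriminator to couple policy and model learning is described in the paper, not here.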

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification