Discriminator-Guided Model-Based Offline Imitation Learning
Wenjia Zhang, Haoran Xu, Haoyi Niu, Peng Cheng, Ming Li, Heming Zhang, Guyue Zhou, Xianyuan Zhan

TL;DR
This paper introduces DMIL, a discriminator-guided framework for offline imitation learning. Its discriminator separates dynamically correct, expert-like model rollouts from incorrect or suboptimal ones, which improves robustness and performance, particularly when expert data is limited.
Contribution
The paper proposes a novel discriminator-guided approach that couples policy and dynamics model learning, improving robustness and handling suboptimal demonstrations in offline IL.
Findings
DMIL outperforms state-of-the-art methods on small datasets.
The discriminator effectively distinguishes correct from suboptimal rollouts.
The framework is extendable to suboptimal demonstration scenarios.
Abstract
Offline imitation learning (IL) is a powerful method to solve decision-making problems from expert demonstrations without reward labels. Existing offline IL methods suffer from severe performance degeneration under limited expert data. Including a learned dynamics model can potentially improve the state-action space coverage of expert data; however, it also faces challenging issues like model approximation/generalization errors and suboptimality of rollout data. In this paper, we propose the Discriminator-guided Model-based offline Imitation Learning (DMIL) framework, which introduces a discriminator to simultaneously distinguish the dynamics correctness and suboptimality of model rollout data against real expert demonstrations. DMIL adopts a novel cooperative-yet-adversarial learning strategy, which uses the discriminator to guide and couple the learning process of the policy and…
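The core idea in the abstract, a discriminator that scores samples as expert-like or rollout-like, can be illustrated with a generic binary classifier. The sketch below is not DMIL's discriminator: the abstract does not specify its architecture, inputs, or loss, so a plain logistic regression trained with binary cross-entropy is swapped in, and the data are synthetic Gaussians standing in for expert transitions and model rollouts. All names (`train_discriminator`, `expert`, `rollout`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_discriminator(expert, rollout, lr=0.1, steps=500):
    """Fit a logistic-regression stand-in D(x) = P(x comes from expert data)."""
    x = np.vstack([expert, rollout])
    # Label expert samples 1 and model-rollout samples 0.
    y = np.concatenate([np.ones(len(expert)), np.zeros(len(rollout))])
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        grad = p - y  # gradient of binary cross-entropy w.r.t. the logits
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


# Synthetic placeholders (assumption): 4-dimensional feature vectors.
rng = np.random.default_rng(0)
expert = rng.normal(loc=1.0, scale=0.5, size=(200, 4))
rollout = rng.normal(loc=-1.0, scale=0.5, size=(200, 4))

w, b = train_discriminator(expert, rollout)
expert_score = sigmoid(expert @ w + b).mean()
rollout_score = sigmoid(rollout @ w + b).mean()
print(f"mean D(expert)={expert_score:.3f}, mean D(rollout)={rollout_score:.3f}")
```

In a setup like the one the abstract describes, scores of this kind would then inform how the policy and dynamics model are trained; how DMIL actually does that is not covered by the truncated abstract and is not modeled here.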
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
