How to Leverage Diverse Demonstrations in Offline Imitation Learning
Sheng Yue, Jiani Liu, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang

TL;DR
This paper presents a novel data selection method based on resulting states to improve offline imitation learning, effectively utilizing diverse behaviors and achieving state-of-the-art results on complex benchmarks.
Contribution
It introduces a simple, effective data selection criterion and a lightweight behavior cloning algorithm that leverage both expert and diverse behaviors in offline IL.
Findings
Outperforms existing methods on 20 out of 21 benchmarks.
Achieves 2-5x performance improvement over prior approaches.
Maintains similar runtime to Behavior Cloning.
Abstract
Offline Imitation Learning (IL) with imperfect demonstrations has garnered increasing attention owing to the scarcity of expert data in many real-world domains. A fundamental problem in this scenario is how to extract positive behaviors from noisy data. In general, current approaches to the problem select data building on state-action similarity to given expert demonstrations, neglecting precious information in (potentially abundant) state-actions that deviate from expert ones. In this paper, we introduce a simple yet effective data selection method that identifies positive behaviors based on their resultant states -- a more informative criterion enabling explicit utilization of dynamics information and effective extraction of both expert and beneficial diverse behaviors. Further, we devise a lightweight behavior cloning algorithm capable of leveraging the expert and…
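To make the idea of selecting data by resultant states concrete, here is a minimal sketch. It is not the authors' algorithm: the nearest-neighbor distance test, the `threshold` parameter, the linear policy, and the function names `select_by_resultant_state` and `fit_linear_bc` are all assumptions chosen for illustration. It keeps a noisy transition when the state it leads to lies close to some expert state, then fits a behavior cloning policy on the kept transitions.

```python
import numpy as np


def select_by_resultant_state(next_states, expert_states, threshold):
    """Boolean mask over transitions whose resultant (next) state lies within
    `threshold` (Euclidean distance) of at least one expert state.

    Hypothetical criterion for illustration; the paper's actual rule may differ.
    """
    # Pairwise differences, shape (num_transitions, num_expert_states, state_dim).
    diffs = next_states[:, None, :] - expert_states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Keep a transition if its closest expert state is near enough.
    return dists.min(axis=1) <= threshold


def fit_linear_bc(states, actions, ridge=1e-6):
    """Behavior cloning with a linear policy a = [s, 1] @ W, fitted by
    ridge-regularized least squares (an assumed stand-in for a neural policy)."""
    X = np.hstack([states, np.ones((len(states), 1))])  # append a bias column
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ actions)


if __name__ == "__main__":
    # Toy data, invented for this example.
    expert_states = np.array([[0.0, 0.0], [1.0, 1.0]])
    noisy_states = np.array([[0.5, 0.5], [3.0, 3.0], [0.9, 1.2]])
    noisy_actions = np.array([[0.5], [2.0], [-1.0]])
    noisy_next_states = np.array([[1.0, 0.9], [5.0, 5.0], [0.1, -0.1]])

    mask = select_by_resultant_state(noisy_next_states, expert_states, threshold=0.25)
    W = fit_linear_bc(noisy_states[mask], noisy_actions[mask])
    print("kept transitions:", mask)
    print("policy weights shape:", W.shape)
```

The selection looks only at where an action leads, not at whether the state-action pair resembles an expert pair, so a transition that starts far from expert data can still be kept if it ends up back on expert-like states.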
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Robot Manipulation and Learning
