Imitation Learning from Suboptimal Demonstrations via Meta-Learning An Action Ranker

Jiangdong Fan; Hongcai He; Paul Weng; Hui Xu; Jie Shao

arXiv:2412.20193·cs.LG·December 31, 2024

TL;DR

ILMAR is a novel imitation learning method that leverages both expert and suboptimal demonstrations through meta-learning an action ranker, significantly improving policy performance in scenarios with limited expert data.

Contribution

This paper introduces ILMAR, a meta-learning based approach that effectively utilizes suboptimal demonstrations by ranking and selectively integrating them into policy learning.

Findings

01. ILMAR outperforms previous methods on various tasks.

02. It effectively utilizes suboptimal demonstrations.

03. The approach improves policy performance with limited expert data.

Abstract

A major bottleneck in imitation learning is the requirement of a large number of expert demonstrations, which can be expensive or inaccessible. Learning from supplementary demonstrations without strict quality requirements has emerged as a powerful paradigm to address this challenge. However, previous methods often fail to fully utilize their potential by discarding non-expert data. Our key insight is that even demonstrations that fall outside the expert distribution but outperform the learned policy can enhance policy performance. To utilize this potential, we propose a novel approach named imitation learning via meta-learning an action ranker (ILMAR). ILMAR implements weighted behavior cloning (weighted BC) on a limited set of expert demonstrations along with supplementary demonstrations. It utilizes the functional of the advantage function to selectively integrate knowledge from the…
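The weighted BC objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 1-D Gaussian policy, the toy data, and the weight values below are all assumptions made for illustration. In ILMAR the weights for supplementary demonstrations come from the meta-learned action ranker; here they are hard-coded placeholders.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a 1-D Gaussian policy evaluated at `action`."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (action - mean) ** 2 / (2.0 * var)


def weighted_bc_loss(log_probs, weights):
    """Weighted negative log-likelihood: -(1/N) * sum_i w_i * log pi(a_i | s_i)."""
    if len(log_probs) != len(weights):
        raise ValueError("log_probs and weights must have the same length")
    if not log_probs:
        raise ValueError("need at least one sample")
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / len(log_probs)


# Toy (demonstrated action, policy mean) pairs; values are made up.
expert_pairs = [(0.5, 0.4), (1.0, 1.1)]
supplementary_pairs = [(0.0, 0.9), (2.0, 1.8)]

expert_log_probs = [gaussian_log_prob(a, m, 1.0) for a, m in expert_pairs]
supp_log_probs = [gaussian_log_prob(a, m, 1.0) for a, m in supplementary_pairs]

# Assumption: expert samples get weight 1.0; supplementary samples get a
# weight in [0, 1]. These placeholder values stand in for a learned weighting.
expert_weights = [1.0, 1.0]
supp_weights = [0.0, 0.7]

loss = weighted_bc_loss(
    expert_log_probs + supp_log_probs,
    expert_weights + supp_weights,
)
print(loss)
```

A weight of 0.0 drops a supplementary sample from the objective entirely, while a weight near 1.0 treats it almost like expert data. Plain behavior cloning on expert data alone is the special case where every supplementary weight is zero.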

Code & Models

Repositories

f-god6/ilmar (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification · Human Pose and Action Recognition · Adversarial Robustness in Machine Learning

Methods: Sparse Evolutionary Training