Exploring Beyond-Demonstrator via Meta Learning-Based Reward Extrapolation

Mingqi Yuan; Man-On Pun

arXiv:2102.02454·cs.LG·February 28, 2022

TL;DR

This paper introduces MLRE, a meta learning-based reward extrapolation method that effectively learns from limited demonstrations to outperform demonstrators, addressing data scarcity issues in imitation learning.

Contribution

The paper proposes a novel meta learning approach for reward extrapolation that requires fewer demonstrations and improves performance over existing methods.

Findings

1. MLRE outperforms similar algorithms in simulation tasks.
2. MLRE is effective with limited demonstration data.
3. MLRE demonstrates significant performance improvements.

Abstract

Extrapolating beyond-demonstrator (BD) performance through imitation learning (IL) aims to learn from and subsequently outperform the demonstrator. To that end, a representative approach is to leverage inverse reinforcement learning (IRL) to infer a reward function from demonstrations and then perform RL on the learned reward function. However, most existing reward extrapolation methods require massive numbers of demonstrations, making them difficult to apply in tasks with limited training data. To address this problem, one simple solution is to perform data augmentation to artificially generate more training data, but this may incur severe inductive bias and a loss of policy performance. In this paper, we propose a novel meta learning-based reward extrapolation (MLRE) algorithm, which can effectively approximate the ground-truth rewards using limited demonstrations. More specifically,…
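To make the "infer a reward function from demonstrations" step concrete, here is a minimal sketch of one common flavor of reward learning: fitting a reward from ranked demonstration pairs. It is not the MLRE algorithm, whose details are cut off in the abstract above. The linear reward over 2-D state features, the Bradley-Terry preference loss, the toy trajectories, and the names `trajectory_return` and `fit_reward` are all assumptions made for illustration.

```python
import math

def trajectory_return(w, traj):
    """Sum of linear rewards w . s over the states of a trajectory."""
    return sum(sum(wi * si for wi, si in zip(w, s)) for s in traj)

def feature_sums(traj, dim):
    """Per-dimension sum of state features along a trajectory."""
    return [sum(s[i] for s in traj) for i in range(dim)]

def fit_reward(ranked_pairs, dim, lr=0.1, epochs=200):
    """Fit w so the better trajectory of each (worse, better) pair gets the higher return."""
    w = [0.0] * dim
    for _ in range(epochs):
        for worse, better in ranked_pairs:
            diff = trajectory_return(w, better) - trajectory_return(w, worse)
            # Bradley-Terry probability that `better` is preferred over `worse`.
            p = 1.0 / (1.0 + math.exp(-diff))
            fb, fw = feature_sums(better, dim), feature_sums(worse, dim)
            # Gradient ascent on log p with respect to w.
            for i in range(dim):
                w[i] += lr * (1.0 - p) * (fb[i] - fw[i])
    return w

# Hypothetical demonstrations, ordered from worst to best; each state is
# (progress, nuisance feature), and only the ranking is given to the learner.
demos = [
    [(0.1, 0.5), (0.2, 0.4)],
    [(0.4, 0.3), (0.5, 0.6)],
    [(0.8, 0.2), (0.9, 0.5)],
]
pairs = [(demos[0], demos[1]), (demos[1], demos[2]), (demos[0], demos[2])]
w = fit_reward(pairs, dim=2)

# A trajectory that goes further than any demonstration.
unseen = [(1.5, 0.4), (1.6, 0.5)]
print(trajectory_return(w, unseen) > trajectory_return(w, demos[2]))
```

In this toy setup the learned reward scores the unseen trajectory above the best demonstration, which is the sense in which a learned reward can extrapolate beyond the demonstrator. An RL algorithm would then be trained against the learned reward; that stage is omitted here.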

Taxonomy

Topics: Muscle activation and electromyography studies · Viral Infectious Diseases and Gene Expression in Insects · Reinforcement Learning in Robotics