Offline Inverse Reinforcement Learning

Firas Jarboui; Vianney Perchet

arXiv:2106.05068 · cs.LG · June 10, 2021

TL;DR

This paper introduces the first offline inverse reinforcement learning (IRL) algorithm, which uses GAN-based data augmentation to learn from a fixed exploratory dataset together with a few expert demonstrations. The resulting policies outperform existing methods in OpenAI Gym environments.

Contribution

It presents a novel offline IRL method that leverages GANs for data augmentation, addressing the challenge of learning from fixed datasets without additional sampling.

Findings

01. The learned policies outperform existing solutions in OpenAI Gym environments.

02. GAN-based augmentation improves IRL performance in offline settings.

03. The method learns effectively from fixed datasets with only a few expert demonstrations.

Abstract

The objective of offline RL is to learn optimal policies when a fixed exploratory demonstration dataset is available and sampling additional observations is impossible (typically because this operation is either costly or raises ethical questions). To solve this problem, off-the-shelf approaches require a properly defined cost function (or its evaluation on the provided dataset), which is seldom available in practice. To circumvent this issue, a reasonable alternative is to query an expert for a few optimal demonstrations in addition to the exploratory dataset. The objective is then to learn an optimal policy w.r.t. the expert's latent cost function. Current solutions either solve a behaviour cloning problem (which does not leverage the exploratory data) or a reinforced imitation learning problem (using a fixed cost function that discriminates available exploratory trajectories…
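
To make the behaviour cloning baseline mentioned in the abstract concrete, the following is a minimal sketch and not the paper's method. It assumes a toy setting with low-dimensional states and discrete actions, and it swaps in a 1-nearest-neighbour classifier as the supervised learner. The names `behaviour_cloning_policy`, `expert_demos` and `exploratory_data`, and all data values, are hypothetical. The point it illustrates is the one the abstract makes: the policy is fitted on the expert pairs only, so the exploratory dataset goes unused.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]
Action = int
Transition = Tuple[State, Action]  # one (state, action) pair


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two states."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def behaviour_cloning_policy(expert: List[Transition]) -> Callable[[State], Action]:
    """Fit a 1-nearest-neighbour policy on expert (state, action) pairs.

    Illustrative stand-in for any supervised learner; hypothetical helper.
    """
    if not expert:
        raise ValueError("need at least one expert transition")

    def policy(state: State) -> Action:
        # Imitate the action taken in the closest expert state.
        _, action = min(expert, key=lambda t: squared_distance(t[0], state))
        return action

    return policy


# Hypothetical toy data: a few expert pairs and a larger exploratory set.
expert_demos: List[Transition] = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]
exploratory_data: List[Transition] = [
    ((0.2, 0.1), 1),
    ((0.9, 0.8), 0),
    ((0.5, 0.4), 1),
]

# Behaviour cloning never looks at exploratory_data.
policy = behaviour_cloning_policy(expert_demos)
```

Calling `policy((0.1, 0.2))` returns the action of the nearest expert state. The exploratory transitions, including those whose actions disagree with the expert, have no influence on the result.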

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Data Stream Mining Techniques