Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments

Alessandro Sestini; Alexander Kuhnle; Andrew D. Bagdanov

arXiv:2012.02527·cs.LG·December 7, 2020

TL;DR

This paper introduces DE-AIRL, a demonstration-efficient inverse reinforcement learning method tailored for procedurally generated environments, enabling reward extrapolation with fewer expert demonstrations and improved generalization.

Contribution

The paper presents DE-AIRL, a novel adversarial IRL approach that reduces demonstration requirements and enhances reward learning in procedurally generated domains.

Findings

01. DE-AIRL significantly decreases the number of demonstrations needed.

02. The method generalizes reward functions across procedural environments.

03. The method is effective on the MiniGrid and DeepCrawl benchmarks.

Abstract

Deep Reinforcement Learning achieves very good results in domains where reward functions can be manually engineered. At the same time, there is growing interest within the community in using games based on Procedural Content Generation (PCG) as benchmark environments, since this type of environment is perfect for studying overfitting and generalization of agents under domain shift. Inverse Reinforcement Learning (IRL) can instead extrapolate reward functions from expert demonstrations, with good results even on high-dimensional problems; however, there are no examples of applying these techniques to procedurally generated environments. This is mostly due to the number of demonstrations needed to find a good reward model. We propose a technique based on Adversarial Inverse Reinforcement Learning which can significantly decrease the need for expert demonstrations in PCG games. Through the…
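As background for the Adversarial Inverse Reinforcement Learning (AIRL) framework the abstract builds on, the following is a minimal sketch of the standard AIRL discriminator, D = exp(f) / (exp(f) + pi(a|s)), and the reward recovered from it. It illustrates only the generic AIRL form and not the DE-AIRL modifications, which this summary does not describe. The function names and the scalar inputs `f_value` (output of the learned reward model) and `log_pi` (log-probability of the action under the current policy) are assumptions made for illustration.

```python
import numpy as np


def airl_discriminator(f_value, log_pi):
    """Standard AIRL discriminator D = exp(f) / (exp(f) + pi).

    Computed as sigmoid(f - log pi), which is algebraically identical
    and avoids exponentiating f directly.
    """
    return 1.0 / (1.0 + np.exp(-(f_value - log_pi)))


def airl_reward(f_value, log_pi):
    """Reward signal log D - log(1 - D), which simplifies to f - log pi."""
    d = airl_discriminator(f_value, log_pi)
    return np.log(d) - np.log(1.0 - d)
```

The discriminator is trained to separate expert transitions from policy transitions, and the policy is then optimized against `airl_reward`; in this generic form, more expert demonstrations give the discriminator a better picture of expert behavior, which is the dependency the paper aims to reduce.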

Taxonomy

Topics: Topic Modeling · Reinforcement Learning in Robotics · Software Engineering Research