Active Exploration for Inverse Reinforcement Learning
David Lindner, Andreas Krause, Giorgia Ramponi

TL;DR
This paper introduces AceIRL, an active IRL algorithm that learns reward functions through sequential interaction with the environment, without needing a generative model, and that outperforms naive exploration strategies in simulations.
Contribution
The paper presents the first active IRL method with sample complexity bounds that does not rely on a generative model, improving learning efficiency in unknown environments.
Findings
AceIRL matches the sample complexity of IRL methods that rely on a generative model.
AceIRL significantly outperforms naive exploration strategies in simulations.
The analysis provides a problem-dependent bound that relates AceIRL's sample complexity to the suboptimality gap.
Abstract
Inverse Reinforcement Learning (IRL) is a powerful paradigm for inferring a reward function from expert demonstrations. Many IRL algorithms require a known transition model and sometimes even a known expert policy, or they at least require access to a generative model. However, these assumptions are too strong for many real-world applications, where the environment can be accessed only through sequential interaction. We propose a novel IRL algorithm: Active exploration for Inverse Reinforcement Learning (AceIRL), which actively explores an unknown environment and expert policy to quickly learn the expert's reward function and identify a good policy. AceIRL uses previous observations to construct confidence intervals that capture plausible reward functions and find exploration policies that focus on the most informative regions of the environment. AceIRL is the first approach to active…
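The idea of steering exploration toward poorly understood regions can be illustrated with a small tabular sketch. This is not the authors' algorithm: AceIRL builds confidence intervals over plausible reward functions, whereas the code below assumes a generic Hoeffding-style width computed from visit counts and plans with that width as a stand-in reward. The function names `confidence_width` and `exploration_policy`, the parameter `delta`, and the toy MDP are all hypothetical.

```python
import numpy as np


def confidence_width(counts, horizon, delta=0.1):
    """Hoeffding-style width per (state, action) pair (an assumed form).

    The width shrinks as 1/sqrt(n) with the visit count n, so rarely
    visited pairs look more informative to the planner.
    """
    n = np.maximum(counts, 1)  # avoid division by zero for unvisited pairs
    num_pairs = counts.size
    return horizon * np.sqrt(np.log(2 * num_pairs / delta) / (2 * n))


def exploration_policy(p_hat, width, horizon):
    """Finite-horizon value iteration that treats the width as a reward.

    p_hat: estimated transition probabilities, shape (S, A, S)
    width: uncertainty per (state, action), shape (S, A)
    Returns a deterministic, time-dependent policy of shape (horizon, S).
    """
    num_states = p_hat.shape[0]
    value = np.zeros(num_states)
    policy = np.zeros((horizon, num_states), dtype=int)
    for h in reversed(range(horizon)):
        q = width + p_hat @ value  # shape (S, A)
        policy[h] = q.argmax(axis=1)
        value = q.max(axis=1)
    return policy


# Toy MDP (hypothetical): action 0 stays in place, action 1 switches state.
p_hat = np.zeros((2, 2, 2))
p_hat[0, 0, 0] = p_hat[1, 0, 1] = 1.0
p_hat[0, 1, 1] = p_hat[1, 1, 0] = 1.0

# Pair (state 1, action 0) has been tried only once, so it is the most uncertain.
counts = np.array([[100, 100], [1, 100]])
width = confidence_width(counts, horizon=2)
policy = exploration_policy(p_hat, width, horizon=2)
```

In this toy setting the planner, starting in state 0, first switches to state 1 and then takes the rarely tried action there, which is the qualitative behaviour the abstract describes as focusing on the most informative regions.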
Taxonomy
Topics: Reinforcement Learning in Robotics · Software Engineering Research · Energy Efficiency and Management
