Probabilistic inverse reinforcement learning in unknown environments

Aristide C. Y. Tossou; Christos Dimitrakakis

arXiv:1307.3785 · stat.ML · July 16, 2013 · 2 cites

TL;DR

This paper introduces a probabilistic inverse reinforcement learning method for unknown stochastic environments, enabling the estimation of agent preferences and improved policy construction without prior knowledge of environment dynamics.

Contribution

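The paper extends probabilistic IRL from known MDPs to unknown dynamics or opponents by deriving two simplified probabilistic models of the demonstrator's policy and utility and fitting them with MAP estimation, which under a flat prior yields a convex optimisation problem.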

Findings

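01. The resulting algorithms are highly competitive against a variety of inverse reinforcement learning methods that do have knowledge of the dynamics.

02. The approach estimates agent preferences from demonstrations in unknown stochastic Markov environments or games.

03. Maximum a posteriori estimation, used in place of full Bayesian inference, keeps the approach tractable; under a flat prior it reduces to a convex optimisation problem.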

Abstract

We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov environments or games. Our aim is to estimate agent preferences in order to construct improved policies for the same task that the agents are trying to solve. To do so, we extend previous probabilistic approaches for inverse reinforcement learning in known MDPs to the case of unknown dynamics or opponents. We do this by deriving two simplified probabilistic models of the demonstrator's policy and utility. For tractability, we use maximum a posteriori estimation rather than full Bayesian inference. Under a flat prior, this results in a convex optimisation problem. We find that the resulting algorithms are highly competitive against a variety of other methods for inverse reinforcement learning that do have knowledge of the dynamics.
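
To make the step "MAP estimation under a flat prior gives a convex problem" concrete, here is a minimal sketch. It does not reproduce either of the paper's two models, which the abstract does not specify. It assumes, purely for illustration, a softmax demonstrator policy over a linear utility, pi(a|s) proportional to exp(w . phi(s, a)). Under a flat prior the MAP estimate coincides with the maximum-likelihood estimate, and the negative log-likelihood of this assumed model is convex in w, so plain gradient descent suffices. No transition model is used, only observed state-action pairs. All names (`phi`, `demos`, `map_estimate`) and the toy data are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def policy(w, phi_s):
    """Assumed demonstrator policy in one state.

    phi_s holds one feature vector per action; the utility of an action
    is the dot product w . phi(s, a).
    """
    scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in phi_s]
    return softmax(scores)

def neg_log_likelihood(w, phi, demos):
    """Negative log-likelihood of the demonstrations (convex in w)."""
    return -sum(math.log(policy(w, phi[s])[a]) for s, a in demos)

def map_estimate(phi, demos, dim, lr=0.5, steps=500):
    """Gradient descent on the negative log-likelihood.

    With a flat prior the log-prior term is constant, so this
    maximum-likelihood solution is also the MAP estimate.
    """
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for s, a in demos:
            p = policy(w, phi[s])
            for k in range(dim):
                # Gradient term: expected feature minus observed feature.
                expected = sum(p[b] * phi[s][b][k] for b in range(len(p)))
                grad[k] += expected - phi[s][a][k]
        w = [wi - lr * g / len(demos) for wi, g in zip(w, grad)]
    return w

# Toy data (hypothetical): two states, two actions, two features.
# In each state one action carries feature 0 and the other feature 1.
phi = {
    0: [[1.0, 0.0], [0.0, 1.0]],
    1: [[0.0, 1.0], [1.0, 0.0]],
}
# The demonstrator picks the feature-0 action in 15 of 20 observations.
demos = [(0, 0)] * 8 + [(0, 1)] * 2 + [(1, 1)] * 7 + [(1, 0)] * 3

w_hat = map_estimate(phi, demos, dim=2)
print("estimated utility weights:", w_hat)
```

In this toy setting only the difference `w_hat[0] - w_hat[1]` is identifiable, since adding a constant to both weights leaves the softmax policy unchanged. That is a property of the assumed model, not a claim about the paper's formulation.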

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Distributed Control Multi-Agent Systems