Inverse Reinforcement Learning with Gaussian Process

Qifeng Qiao; Peter A. Beling

arXiv:1208.2112 · cs.LG · January 22, 2013

TL;DR

This paper introduces a Gaussian process-based inverse reinforcement learning (IRL) method that handles large or even infinite state spaces without assuming a form for the reward function, and reports better accuracy and robustness than an established algorithm on small-scale numerical problems.

Contribution

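The paper proposes an IRL algorithm that models the reward with a Gaussian process and represents observed decision trajectories as preference graphs, so the reward can be inferred without assuming its functional form while keeping the computation manageable.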

Findings

01

Better accuracy in apprenticeship learning than an established algorithm on small-scale numerical problems.

02

Performance is more robust to changes in the number of observations.

03

Handles large or infinite state spaces effectively.

Abstract

We present new algorithms for inverse reinforcement learning (IRL, or inverse optimal control) in convex optimization settings. We argue that finite-space IRL can be posed as a convex quadratic program under a Bayesian inference framework with the objective of maximum a posteriori estimation. To deal with problems in large or even infinite state spaces, we propose a Gaussian process model and use preference graphs to represent observations of decision trajectories. Our method is distinguished from other approaches to IRL in that it makes no assumptions about the form of the reward function, yet it retains the promise of computationally manageable implementations for potential real-world applications. In comparison with an established algorithm on small-scale numerical problems, our method demonstrated better accuracy in apprenticeship learning and a more robust dependence on the number…
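
To make the Gaussian process idea concrete, here is a minimal sketch of how a GP can extend reward estimates from a handful of states to unseen states in a continuous state space. It is not the paper's algorithm: the paper infers rewards from preference graphs built from observed trajectories, whereas this sketch assumes reward values are already available at a few states. The function names, the squared-exponential kernel, and the toy reward values are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior_mean(train_states, train_rewards, query_states, noise=1e-6):
    """Posterior mean of a zero-mean GP evaluated at query_states."""
    k_train = rbf_kernel(train_states, train_states)
    k_train += noise * np.eye(len(train_states))  # jitter for numerical stability
    k_query = rbf_kernel(query_states, train_states)
    weights = np.linalg.solve(k_train, train_rewards)
    return k_query @ weights

# Hypothetical 1-D continuous state space; the reward values are made up.
train_states = np.linspace(0.0, 4.0, 5).reshape(-1, 1)
train_rewards = np.sin(train_states).ravel()
query_states = np.array([[0.5], [2.0], [3.5]])

predicted_rewards = gp_posterior_mean(train_states, train_rewards, query_states)
print(predicted_rewards)
```

The kernel is what lets the estimate generalize: states close to one another receive similar rewards, so no parametric form for the reward function has to be chosen in advance.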

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics