Bellman Gradient Iteration for Inverse Reinforcement Learning

Kun Li; Yanan Sui; Joel W. Burdick

arXiv:1707.07767 · cs.LG · July 26, 2017 · 2 cites

TL;DR

This paper introduces an inverse reinforcement learning algorithm that uses Bellman Gradient Iteration to recover a reward function from an agent's observed actions. With only a linear reward function, it reaches accuracy comparable to a state-of-the-art method that uses a non-linear reward.

Contribution

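It pairs two approximations of the Bellman Optimality Equation, which handle different types of actions, with a Bellman Gradient Iteration method that computes the gradient of the Q-value with respect to the reward function. The resulting differentiable relation lets the reward function be learned with gradient methods directly from observed actions rather than from trajectories.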

Findings

01. With a linear reward function, the method achieves accuracy comparable to a state-of-the-art non-linear reward approach.

02. Learning from actions rather than trajectories makes it more flexible.

03. Performance is validated in two simulated environments.

Abstract

This paper develops an inverse reinforcement learning algorithm aimed at recovering a reward function from the observed actions of an agent. We introduce a strategy to flexibly handle different types of actions with two approximations of the Bellman Optimality Equation, and a Bellman Gradient Iteration method to compute the gradient of the Q-value with respect to the reward function. These methods allow us to build a differentiable relation between the Q-value and the reward function and learn an approximately optimal reward function with gradient methods. We test the proposed method in two simulated environments by evaluating the accuracy of different approximations and comparing the proposed method with existing solutions. The results show that even with a linear reward function, the proposed method has a comparable accuracy with the state-of-the-art method adopting a non-linear…
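
The idea of iterating a gradient alongside the Bellman update can be sketched on a small tabular problem. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes a log-sum-exp smoothing of the max operator as one possible differentiable approximation of the Bellman Optimality Equation (the abstract does not say which two approximations are used), a state-only linear reward r = Phi @ theta, and a known transition model. The names soft_q_and_gradient, P, Phi, theta, and k are hypothetical.

```python
import numpy as np

def soft_q_and_gradient(P, Phi, theta, gamma=0.9, k=5.0, iters=200):
    """Iterate a smoothed Bellman update and its gradient together.

    P:     (A, S, S) transition probabilities, P[a, s, s'].
    Phi:   (S, D) state features; the reward is assumed linear, r = Phi @ theta.
    k:     sharpness of the log-sum-exp smoothing (assumed approximation).
    Returns Q with shape (S, A) and dQ/dtheta with shape (S, A, D).
    """
    A, S, _ = P.shape
    D = Phi.shape[1]
    r = Phi @ theta
    Q = np.zeros((S, A))
    dQ = np.zeros((S, A, D))
    for _ in range(iters):
        # Smoothed max over actions: V(s) = (1/k) * log(sum_a exp(k * Q(s, a))).
        m = Q.max(axis=1, keepdims=True)          # subtract max for stability
        w = np.exp(k * (Q - m))
        V = m[:, 0] + np.log(w.sum(axis=1)) / k
        # dV/dQ is the softmax of k * Q, so dV/dtheta is a weighted sum of dQ.
        w /= w.sum(axis=1, keepdims=True)
        dV = np.einsum('sa,sad->sd', w, dQ)
        # Bellman update and the matching gradient update.
        Q = r[:, None] + gamma * np.einsum('ast,t->sa', P, V)
        dQ = Phi[:, None, :] + gamma * np.einsum('ast,td->sad', P, dV)
    return Q, dQ

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.random((2, 4, 4))
    P /= P.sum(axis=2, keepdims=True)             # normalize rows to probabilities
    Phi = rng.random((4, 3))
    theta = rng.random(3)
    Q, dQ = soft_q_and_gradient(P, Phi, theta)
    print(Q.shape, dQ.shape)
```

Because every step of the smoothed update is differentiable, dQ tracks the exact derivative of the iterated Q with respect to theta. A reward-learning loop could then feed dQ into the chain rule for whatever action likelihood is chosen; that likelihood is not specified in the abstract, so it is left out here.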

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Evolutionary Algorithms and Applications