Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces

Angeliki Kamoutsi, Peter Schmitt-Förster, Tobias Sutter, Volkan Cevher, and John Lygeros

arXiv:2405.15509 · math.OC · May 27, 2024

TL;DR

This paper develops randomized algorithms with PAC bounds for inverse reinforcement learning in continuous spaces, addressing the challenges of infinite-dimensional problems and limited expert data.

Contribution

It introduces a probabilistic approach using scenario optimization and provides sample complexity bounds for inverse RL with continuous states and actions.

Findings

01 The scenario approach yields epsilon-optimal solutions with probabilistic guarantees.

02 Sample complexity bounds depend on the desired accuracy and confidence levels.

03 The error incurred by working with a finite set of expert demonstrations can be bounded.
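
To make the dependence on accuracy and confidence concrete, here is a minimal sketch based on the classical scenario-optimization condition for convex programs, which may differ from the specific bounds derived in the paper. It assumes a convex scenario program with `n_vars` decision variables (for example, the coefficients of a linear function approximator), a tolerated constraint-violation probability `epsilon`, and a confidence parameter `beta`. The function names `violation_tail` and `min_scenarios` are hypothetical and chosen only for illustration.

```python
import math


def violation_tail(n_samples: int, n_vars: int, epsilon: float) -> float:
    """Binomial tail P(Binomial(n_samples, epsilon) <= n_vars - 1).

    In the classical convex scenario approach (an assumption here, not
    necessarily the paper's bound), this quantity upper-bounds the
    probability that the sampled solution violates the original
    constraints with probability larger than epsilon.
    Terms are evaluated in log space to avoid overflow for large n_samples.
    """
    total = 0.0
    for i in range(n_vars):
        log_term = (
            math.lgamma(n_samples + 1)
            - math.lgamma(i + 1)
            - math.lgamma(n_samples - i + 1)
            + i * math.log(epsilon)
            + (n_samples - i) * math.log1p(-epsilon)
        )
        total += math.exp(log_term)
    return total


def min_scenarios(n_vars: int, epsilon: float, beta: float) -> int:
    """Smallest number of sampled constraints N with violation_tail(N) <= beta."""
    if n_vars < 1 or not (0.0 < epsilon < 1.0) or not (0.0 < beta < 1.0):
        raise ValueError("need n_vars >= 1 and epsilon, beta in (0, 1)")
    # Explicit sufficient sample size N >= (2 / epsilon) * (ln(1 / beta) + n_vars),
    # used here only as an upper end for the search.
    hi = math.ceil(2.0 / epsilon * (math.log(1.0 / beta) + n_vars))
    lo = n_vars
    # The tail is non-increasing in N, so binary search is valid.
    while lo < hi:
        mid = (lo + hi) // 2
        if violation_tail(mid, n_vars, epsilon) <= beta:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for d in (1, 5, 20):
        print(d, min_scenarios(n_vars=d, epsilon=0.1, beta=0.01))
```

Under these assumptions, the required number of samples grows roughly like 1/epsilon in the accuracy and only logarithmically in 1/beta, which is why high confidence is comparatively cheap.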

Abstract

This work studies discrete-time discounted Markov decision processes with continuous state and action spaces and addresses the inverse problem of inferring a cost function from observed optimal behavior. We first consider the case in which we have access to the entire expert policy and characterize the set of solutions to the inverse problem by using occupation measures, linear duality, and complementary slackness conditions. To avoid trivial solutions and ill-posedness, we introduce a natural linear normalization constraint. This results in an infinite-dimensional linear feasibility problem, prompting a thorough analysis of its properties. Next, we use linear function approximators and adopt a randomized approach, namely the scenario approach and related probabilistic feasibility guarantees, to derive epsilon-optimal solutions for the inverse problem. We further discuss the sample…
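
As a rough illustration of the randomized step mentioned in the abstract, the generic scenario approach replaces a constraint that must hold for every element of an infinite set with the same constraint enforced at finitely many random samples. The notation below (decision vector \theta of dimension d, cost vector c, constraint function g, uncertainty set \Delta, sampling distribution \mathbb{P}) is assumed for illustration and is not taken from the paper.

```latex
\begin{aligned}
\text{semi-infinite program:}\quad
  & \min_{\theta \in \mathbb{R}^d} \; c^\top \theta
    \quad \text{s.t.} \quad g(\theta, \delta) \le 0
    \quad \forall\, \delta \in \Delta, \\
\text{scenario program:}\quad
  & \min_{\theta \in \mathbb{R}^d} \; c^\top \theta
    \quad \text{s.t.} \quad g(\theta, \delta_i) \le 0,
    \quad i = 1, \dots, N,
    \quad \delta_i \overset{\text{i.i.d.}}{\sim} \mathbb{P}.
\end{aligned}
```

When g is convex in \theta, the scenario program is a finite convex problem, and its solution satisfies the original constraints except on a set of small probability, with high confidence over the draw of the N samples.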

Code & Models

Repositories

RAPACIRLCS/code · jax · Official

Videos

Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Evolutionary Algorithms and Applications

Methods: Sparse Evolutionary Training