Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory

Mustafa Mert Çelikok; Frans A. Oliehoek; Jan-Willem van de Meent

arXiv:2405.19024·cs.LG·August 5, 2024

TL;DR

This paper develops a theoretical framework for inverse reinforcement learning (IRL) with concave utilities and shows that the problem is equivalent to an inverse game theory problem in mean-field games, a setting that most standard IRL results do not cover.

Contribution

The paper gives the first treatment of inverse CURL problems, establishing that they are equivalent to inverse game theory problems in mean-field games.

Findings

01. Most standard IRL results do not apply to CURL, because CURL invalidates the Bellman equations those results rely on.

02. The paper proposes a new definition of feasible rewards for inverse CURL, based on the equivalence between CURL and mean-field games.

03. The paper outlines future research directions and applications in human-AI collaboration.

Abstract

We consider inverse reinforcement learning problems with concave utilities. Concave Utility Reinforcement Learning (CURL) is a generalisation of the standard RL objective, which employs a concave function of the state occupancy measure, rather than a linear function. CURL has garnered recent attention for its ability to represent instances of many important applications, including standard RL, imitation learning, pure exploration, constrained MDPs, offline RL, human-regularized RL, and others. Inverse reinforcement learning is a powerful paradigm that focuses on recovering an unknown reward function that can rationalize the observed behaviour of an agent. There have been recent theoretical advances in inverse RL where the problem is formulated as identifying the set of feasible reward functions. However, inverse RL for CURL problems has not been considered previously. In this…
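To make the distinction between a linear and a concave objective concrete, the following is a minimal sketch, not taken from the paper. It assumes a hypothetical two-state, two-action MDP, a discount factor of 0.9, and the entropy of the occupancy measure as the concave utility (the pure-exploration instance mentioned in the abstract). The function names `occupancy`, `linear_utility`, and `entropy_utility` are illustrative choices.

```python
import numpy as np

def occupancy(P, pi, mu0, gamma=0.9):
    """Normalised discounted state occupancy d_pi of a policy.

    P: (S, A, S) transition tensor, pi: (S, A) policy, mu0: (S,) initial distribution.
    """
    S = P.shape[0]
    # State-to-state kernel under pi: P_pi[s, t] = sum_a pi[s, a] * P[s, a, t]
    P_pi = np.einsum("sa,sat->st", pi, P)
    # d = (1 - gamma) * (I - gamma * P_pi^T)^{-1} mu0
    return (1 - gamma) * np.linalg.solve(np.eye(S) - gamma * P_pi.T, mu0)

def linear_utility(d, r):
    """Standard RL objective: linear in the occupancy measure."""
    return float(r @ d)

def entropy_utility(d):
    """A concave utility (assumed here): entropy of the occupancy measure."""
    p = d[d > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())

# Hypothetical MDP: action 0 stays in the current state, action 1 switches state.
P = np.zeros((2, 2, 2))
for s in range(2):
    P[s, 0, s] = 1.0
    P[s, 1, 1 - s] = 1.0
mu0 = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])  # illustrative state reward

pi_stay = np.array([[1.0, 0.0], [1.0, 0.0]])
pi_switch = np.array([[0.0, 1.0], [0.0, 1.0]])
d_stay = occupancy(P, pi_stay, mu0)
d_switch = occupancy(P, pi_switch, mu0)
d_mix = 0.5 * d_stay + 0.5 * d_switch  # occupancy measures form a convex set

# The linear objective of the mixture equals the mixture of the objectives ...
print(linear_utility(d_mix, r),
      0.5 * linear_utility(d_stay, r) + 0.5 * linear_utility(d_switch, r))
# ... while the concave objective of the mixture is at least as large.
print(entropy_utility(d_mix),
      0.5 * entropy_utility(d_stay) + 0.5 * entropy_utility(d_switch))
```

In this sketch the linear objective is fully described by a per-state reward vector `r`, whereas the entropy objective cannot be written as an inner product of the occupancy measure with any fixed reward vector. That gap is one way to see why results built on a fixed reward and its Bellman equations need not carry over to CURL.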

Taxonomy

Topics: Evolutionary Algorithms and Applications

Methods: Sparse Evolutionary Training