Partial Identifiability in Inverse Reinforcement Learning For Agents With Non-Exponential Discounting

Joar Skalse; Alessandro Abate

arXiv:2412.11155·cs.LG·December 17, 2024

TL;DR

This paper investigates the limitations of inverse reinforcement learning in accurately inferring preferences of agents with non-exponential discounting, revealing fundamental challenges in preference identification for more realistic human-like decision models.

Contribution

The paper extends the theory of partial identifiability in IRL to agents with non-exponential discounting, including hyperbolic discounting, and highlights limitations that arise for these agents.

Findings

01

IRL cannot fully identify preferences for non-exponentially discounted agents

02

Partial identifiability results apply broadly to hyperbolic and other non-exponential discounting models

03

IRL alone is often insufficient to determine the true reward function for such agents

Abstract

The aim of inverse reinforcement learning (IRL) is to infer an agent's preferences from observing their behaviour. Usually, preferences are modelled as a reward function, $R$, and behaviour is modelled as a policy, $π$. One of the central difficulties in IRL is that multiple preferences may lead to the same observed behaviour. That is, $R$ is typically underdetermined by $π$, which means that $R$ is only partially identifiable. Recent work has characterised the extent of this partial identifiability for different types of agents, including optimal and Boltzmann-rational agents. However, work so far has only considered agents that discount future reward exponentially: this is a serious limitation, especially given that extensive work in the behavioural sciences suggests that humans are better modelled as discounting hyperbolically. In this work, we newly characterise partial…
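To make the contrast between exponential and hyperbolic discounting concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: it assumes the standard textbook forms $γ^t$ and $1/(1 + kt)$, and the values `gamma=0.9`, `k=1.0`, the reward sizes, and the function names are hypothetical choices. It shows the well-known preference reversal: a hyperbolic discounter may pick a smaller, sooner reward when it is immediate, yet pick the larger, later reward when both options are pushed further into the future, whereas an exponential discounter ranks the two options the same way at every delay.

```python
def exponential_discount(t, gamma=0.9):
    """Exponential discount weight gamma**t (gamma is an assumed value)."""
    return gamma ** t


def hyperbolic_discount(t, k=1.0):
    """Hyperbolic discount weight 1 / (1 + k*t) (k is an assumed value)."""
    return 1.0 / (1.0 + k * t)


def prefers_later(discount, small, t_small, large, t_large):
    """True if the discounted larger, later reward beats the smaller, sooner one."""
    return large * discount(t_large) > small * discount(t_small)


# Hypothetical choice: 5 units at time t versus 10 units at time t + 3.
for delay in (0, 10):
    exp_choice = prefers_later(exponential_discount, 5, delay, 10, delay + 3)
    hyp_choice = prefers_later(hyperbolic_discount, 5, delay, 10, delay + 3)
    print(f"delay={delay}: exponential prefers later={exp_choice}, "
          f"hyperbolic prefers later={hyp_choice}")
```

With these assumed numbers the exponential agent makes the same choice at both delays, while the hyperbolic agent switches from the sooner reward to the later one as the delay grows. Time-dependent behaviour of this kind is one reason results derived for exponentially discounting agents may not transfer directly.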


Taxonomy

Topics: Auction Theory and Applications · Decision-Making and Behavioral Economics · Supply Chain and Inventory Management