Inverse Reinforcement Learning using Revealed Preferences and Passive Stochastic Optimization

Vikram Krishnamurthy

arXiv:2507.04396·cs.LG·July 8, 2025

TL;DR

This three-chapter monograph studies inverse reinforcement learning (IRL) through revealed preference theory from microeconomics, Bayesian methods, and adaptive Langevin dynamics, with techniques for reconstructing utility functions and detecting utility-maximizing behavior from noisy observations.

Contribution

It introduces a novel IRL framework combining revealed preference theory, Bayesian analysis, and adaptive Langevin dynamics for real-time utility estimation.

Findings

01. Reconstruction of utility functions from observed actions.

02. Detection of utility maximization behavior under noise.

03. Adaptive IRL algorithms for tracking time-varying utilities.

Abstract

This monograph, spanning three chapters, explores Inverse Reinforcement Learning (IRL). The first two chapters view IRL through the lens of revealed preferences from microeconomics, while the third chapter studies adaptive IRL via Langevin dynamics stochastic gradient algorithms. Chapter 1 uses classical revealed preference theory (Afriat's theorem and extensions) to identify constrained utility maximizers based on observed agent actions. This allows for the reconstruction of set-valued estimates of an agent's utility. We illustrate this procedure by identifying the presence of a cognitive radar and reconstructing its utility function. The chapter also addresses the construction of a statistical detector for utility maximization behavior when agent actions are corrupted by noise. Chapter 2 studies Bayesian IRL. It investigates how an analyst can…
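
The revealed-preference test in Chapter 1 can be illustrated with a small sketch. By Afriat's theorem, a finite dataset of probes and responses is consistent with utility maximization under a linear budget constraint exactly when it satisfies the Generalized Axiom of Revealed Preference (GARP). The Python below is a minimal, noise-free GARP check, not the monograph's own code: the function name `satisfies_garp`, the `tol` parameter, and the toy price/bundle data are assumptions made for illustration, and the check neither reconstructs a utility function nor implements the statistical detector for noisy actions.

```python
def satisfies_garp(prices, bundles, tol=1e-9):
    """Return True if the (price, bundle) observations satisfy GARP.

    prices[k] and bundles[k] are equal-length sequences of floats:
    the k-th probe (price vector) and the agent's k-th response.
    Hypothetical helper written for illustration only.
    """
    n = len(prices)

    def dot(a, b):
        return sum(u * v for u, v in zip(a, b))

    # cost[i][j] = p_i . x_j, the cost of bundle j at prices i
    cost = [[dot(prices[i], bundles[j]) for j in range(n)] for i in range(n)]

    # Direct revealed preference: x_i is preferred to x_j when x_j was
    # affordable at prices p_i but x_i was chosen instead.
    pref = [[cost[i][i] >= cost[i][j] - tol for j in range(n)] for i in range(n)]

    # Transitive closure of the relation (Warshall's algorithm).
    for k in range(n):
        for i in range(n):
            if pref[i][k]:
                for j in range(n):
                    if pref[k][j]:
                        pref[i][j] = True

    # GARP fails if x_i is revealed preferred to x_j while x_i was
    # strictly cheaper than x_j at prices p_j.
    for i in range(n):
        for j in range(n):
            if pref[i][j] and cost[j][j] > cost[j][i] + tol:
                return False
    return True


# Toy data (assumed): two probes, two responses.
consistent = satisfies_garp([(1, 2), (2, 1)], [(2, 1), (1, 2)])
violating = satisfies_garp([(1, 2), (2, 1)], [(1, 2), (2, 1)])
```

In the first dataset neither response was affordable when the other was chosen, so no preference cycle arises and `consistent` is `True`. In the second, each response is chosen while the other is strictly cheaper, which is a revealed-preference cycle, so `violating` is `False`.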
