Inverse Reinforcement Learning using Revealed Preferences and Passive Stochastic Optimization
Vikram Krishnamurthy

TL;DR
This paper explores inverse reinforcement learning through economic revealed preferences, Bayesian methods, and adaptive Langevin dynamics, providing new techniques for utility reconstruction and behavior detection in noisy environments.
Contribution
It develops an IRL framework that draws on revealed preference theory, Bayesian analysis, and adaptive Langevin dynamics for real-time utility estimation.
Findings
Reconstruction of utility functions from observed actions.
Detection of utility maximization behavior under noise.
Adaptive IRL algorithms for tracking time-varying utilities.
Abstract
This monograph, spanning three chapters, explores inverse reinforcement learning (IRL). The first two chapters view IRL through the lens of revealed preferences from microeconomics, while the third chapter studies adaptive IRL via Langevin dynamics stochastic gradient algorithms. Chapter 1 uses classical revealed preference theory (Afriat's theorem and extensions) to identify constrained utility maximizers based on observed agent actions. This allows for the reconstruction of set-valued estimates of an agent's utility. We illustrate this procedure by identifying the presence of a cognitive radar and reconstructing its utility function. The chapter also addresses the construction of a statistical detector for utility maximization behavior when agent actions are corrupted by noise. Chapter 2 studies Bayesian IRL. It investigates how an analyst can…
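As a rough illustration of the revealed-preference test that Afriat's theorem rests on, the sketch below checks the Generalized Axiom of Revealed Preference (GARP) on a finite dataset in the standard consumer-theory form: a dataset of price vectors and chosen bundles is consistent with maximizing some utility under a linear budget constraint if and only if GARP holds. This is not taken from the monograph. The function names `dot` and `satisfies_garp`, the tolerance `tol`, and the price/bundle encoding are assumptions made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_garp(prices, bundles, tol=1e-12):
    """Return True if the dataset {(prices[k], bundles[k])} satisfies GARP.

    Illustrative sketch only: prices[k] is the price vector faced at
    observation k, and bundles[k] is the bundle chosen at that observation.
    """
    n = len(prices)

    # Direct revealed preference: bundle i is revealed preferred to bundle j
    # if j was affordable (cost no more than i) at the prices where i was chosen.
    pref = [
        [dot(prices[i], bundles[i]) >= dot(prices[i], bundles[j]) - tol
         for j in range(n)]
        for i in range(n)
    ]

    # Transitive closure of the relation (Warshall's algorithm).
    for k in range(n):
        for i in range(n):
            if pref[i][k]:
                for j in range(n):
                    if pref[k][j]:
                        pref[i][j] = True

    # GARP is violated if i is revealed preferred to j, yet i was strictly
    # cheaper than j at the prices where j was chosen.
    for i in range(n):
        for j in range(n):
            if pref[i][j] and dot(prices[j], bundles[j]) > dot(prices[j], bundles[i]) + tol:
                return False
    return True


# Agent buys more of whichever good is cheaper: consistent with utility maximization.
consistent = satisfies_garp([(1, 2), (2, 1)], [(2, 1), (1, 2)])

# Agent buys more of whichever good is dearer: each bundle is revealed
# strictly preferred to the other, so GARP fails.
inconsistent = satisfies_garp([(1, 2), (2, 1)], [(1, 2), (2, 1)])
```

When the test passes, Afriat's theorem guarantees that a feasible solution to the associated linear inequalities exists, and the set of such solutions is one way to obtain a set-valued utility estimate of the kind the abstract describes. Handling noisy observations requires a statistical test rather than this exact check.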
