Preference elicitation and inverse reinforcement learning
Constantin Rothkopf, Christos Dimitrakakis

TL;DR
This paper introduces a Bayesian framework for inverse reinforcement learning based on preference elicitation, enabling accurate preference recovery and improved policy derivation even from sub-optimal demonstrations.
Contribution
It generalizes Bayesian inverse reinforcement learning (IRL) with a statistical formulation that yields a posterior distribution over the agent's preferences, its policy and, optionally, the obtained reward sequence, given observations of the agent.
Findings
Preferences can be accurately inferred from sub-optimal policies.
The approach yields policies that are significantly better with respect to the agent's preferences than both other methods and the demonstrated policy itself.
The relation of the approach to other statistical methods for inverse reinforcement learning is examined through analysis and experimental results.
Abstract
We state the problem of inverse reinforcement learning in terms of preference elicitation, resulting in a principled (Bayesian) statistical formulation. This generalises previous work on Bayesian inverse reinforcement learning and allows us to obtain a posterior distribution on the agent's preferences, policy and, optionally, the obtained reward sequence, from observations. We examine the relation of the resulting approach to other statistical methods for inverse reinforcement learning via analysis and experimental results. We show that preferences can be determined accurately, even if the observed agent's policy is sub-optimal with respect to its own preferences. In that case, significantly improved policies with respect to the agent's preferences are obtained, compared to both other methods and to the performance of the demonstrated policy.
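Illustrative sketch
The abstract describes obtaining a posterior over an agent's preferences from observed behaviour, even when that behaviour is sub-optimal. The toy Python example below is one way to picture that idea. It is not the authors' algorithm: the three-state chain environment, the two candidate reward vectors, the uniform prior, the softmax (noisily rational) policy model, and every function name are assumptions made here for illustration only.

```python
import math

# Hypothetical toy setup (not from the paper): a 3-state chain,
# action 0 moves left and action 1 moves right.
N_STATES, ACTIONS, GAMMA = 3, (0, 1), 0.9


def step(s, a):
    """Deterministic chain dynamics, clipped at both ends."""
    return max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)


def q_values(reward, iters=200):
    """Value iteration; the reward is collected on arrival in a state."""
    v = [0.0] * N_STATES
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(iters):
        q = [[reward[step(s, a)] + GAMMA * v[step(s, a)] for a in ACTIONS]
             for s in range(N_STATES)]
        v = [max(row) for row in q]
    return q


def log_likelihood(demos, reward, beta=2.0):
    """Log-probability of (state, action) pairs under a softmax policy.

    The softmax model is an assumption: it lets the demonstrator make
    occasional mistakes, so sub-optimal data still has non-zero likelihood.
    """
    q = q_values(reward)
    total = 0.0
    for s, a in demos:
        log_z = math.log(sum(math.exp(beta * x) for x in q[s]))
        total += beta * q[s][a] - log_z
    return total


def posterior(demos, hypotheses, prior=None, beta=2.0):
    """Bayes' rule over a finite set of candidate reward vectors."""
    n = len(hypotheses)
    prior = prior or [1.0 / n] * n
    logs = [math.log(p) + log_likelihood(demos, r, beta)
            for p, r in zip(prior, hypotheses)]
    m = max(logs)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in logs]
    z = sum(weights)
    return [w / z for w in weights]


# Two hypothetical preferences: reward in the left-most or right-most state.
hypotheses = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
# Demonstrations mostly head right, with one "mistaken" left move at state 1.
demos = [(0, 1), (1, 1), (2, 1), (1, 0), (0, 1), (1, 1)]
post = posterior(demos, hypotheses)
print(post)
```

In this toy setting the posterior mass concentrates on the second hypothesis (reward in the right-most state) despite the one mistaken action. Acting greedily on that hypothesis would then avoid the mistake, which loosely mirrors the abstract's claim that the recovered preferences can support a better policy than the one demonstrated.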
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Evolutionary Algorithms and Applications
