Reward Compatibility: A Framework for Inverse RL
Filippo Lazzati, Mirco Mutti, Alberto Metelli

TL;DR
This paper introduces a novel theoretical framework called reward compatibility for inverse reinforcement learning, enabling analysis and algorithms that extend IRL to large-scale MDPs and various data settings.
Contribution
The paper proposes reward compatibility as a new framework for IRL, generalizing feasible reward sets and enabling provably efficient algorithms for complex environments.
Findings
Reward compatibility quantifies how well a reward aligns with expert demonstrations.
The framework extends IRL analysis from tabular to large-scale MDPs.
The paper provides algorithms with sample complexity analyses for different IRL settings.
Abstract
We provide an original theoretical study of Inverse Reinforcement Learning (IRL) through the lens of reward compatibility, a novel framework to quantify the compatibility of a reward with the given expert's demonstrations. Intuitively, a reward is more compatible with the demonstrations the closer the performance of the expert's policy computed with that reward is to the optimal performance for that reward. This generalizes the notion of feasible reward set, the most common framework in the theoretical IRL literature, for which a reward is either compatible or not compatible. The grayscale introduced by the reward compatibility is the key to extend the realm of provably efficient IRL far beyond what is attainable with the feasible reward set: from tabular to large-scale MDPs. We analyze the IRL problem across various settings, including optimal and suboptimal expert's demonstrations and…
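The intuition in the abstract can be made concrete on a toy problem. The sketch below is one plausible reading, not the paper's formal definition: it scores a reward by the gap between the optimal return under that reward and the return of the expert's policy under the same reward, so a gap of zero corresponds to membership in the feasible reward set and larger gaps mean lower compatibility. The two-state MDP, the horizon, and every name in the code (`P`, `H`, `policy_value`, `optimal_value`, `incompatibility`) are hypothetical and chosen only for illustration.

```python
# Toy illustration (assumed setup, not from the paper): a deterministic
# 2-state, 2-action MDP with horizon H. Action a always moves to state a.
H = 3
STATES = [0, 1]
ACTIONS = [0, 1]

# P[s][a] is the list of next-state probabilities.
P = {
    0: {0: [1.0, 0.0], 1: [0.0, 1.0]},
    1: {0: [1.0, 0.0], 1: [0.0, 1.0]},
}


def policy_value(reward, policy, start=0):
    """H-step expected return of a stationary deterministic policy."""
    v = [0.0 for _ in STATES]
    for _ in range(H):  # backward induction over the horizon
        v = [
            reward[s][policy[s]]
            + sum(p * v[s2] for s2, p in enumerate(P[s][policy[s]]))
            for s in STATES
        ]
    return v[start]


def optimal_value(reward, start=0):
    """H-step optimal return under the given reward (value iteration)."""
    v = [0.0 for _ in STATES]
    for _ in range(H):
        v = [
            max(
                reward[s][a] + sum(p * v[s2] for s2, p in enumerate(P[s][a]))
                for a in ACTIONS
            )
            for s in STATES
        ]
    return v[start]


def incompatibility(reward, expert_policy, start=0):
    """Suboptimality of the expert under `reward`; 0 means fully compatible."""
    return optimal_value(reward, start) - policy_value(reward, expert_policy, start)


# Hypothetical expert: always takes action 1.
expert = {0: 1, 1: 1}

# Three candidate rewards, indexed as reward[state][action].
reward_a = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 1.0}}  # pays for action 1
reward_b = {0: {0: 1.0, 1: 0.0}, 1: {0: 1.0, 1: 0.0}}  # pays for action 0
reward_c = {0: {0: 1.0, 1: 0.5}, 1: {0: 0.0, 1: 1.0}}  # expert is slightly suboptimal

for name, r in [("a", reward_a), ("b", reward_b), ("c", reward_c)]:
    print(name, incompatibility(r, expert))
```

Under a binary feasible-set view, `reward_a` would be accepted and both `reward_b` and `reward_c` rejected. The graded score instead separates `reward_c`, under which the expert is nearly optimal, from `reward_b`, under which the expert earns nothing.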
Taxonomy
Topics: Formal Methods in Verification · Safety Systems Engineering in Autonomy · Software Reliability and Analysis Research
