Inverse Reinforcement Learning with Sub-optimal Experts
Riccardo Poiani, Gabriele Curti, Alberto Maria Metelli, Marcello Restelli

TL;DR
This paper extends inverse reinforcement learning to incorporate multiple sub-optimal experts, analyzing the feasible reward set and its statistical complexity, with theoretical insights on reward compatibility and optimal sampling strategies.
Contribution
It introduces a novel IRL framework that accounts for sub-optimal experts and characterizes the feasible reward set's properties and estimation complexity.
Findings
Multiple sub-optimal experts reduce the feasible reward set.
The uniform sampling algorithm is minimax optimal under certain conditions.
The paper provides a theoretical analysis of reward compatibility and of the statistical complexity of estimating the feasible reward set.
Abstract
Inverse Reinforcement Learning (IRL) techniques deal with the problem of deducing a reward function that explains the behavior of an expert agent who is assumed to act optimally in an underlying unknown task. In several problems of interest, however, it is possible to observe the behavior of multiple experts with different degrees of optimality (e.g., racing drivers whose skills range from amateurs to professionals). For this reason, in this work, we extend the IRL formulation to problems where, in addition to demonstrations from the optimal agent, we can observe the behavior of multiple sub-optimal experts. Given this problem, we first study the theoretical properties of the class of reward functions that are compatible with a given set of experts, i.e., the feasible reward set. Our results show that the presence of multiple sub-optimal experts can significantly shrink the set of…
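To make the idea of a feasible reward set concrete, here is a minimal sketch on a toy tabular MDP. It is an illustration under stated assumptions, not the paper's algorithm. In particular, it assumes that each sub-optimal expert comes with a known bound xi on how far its value may fall below the optimal value in any state; that modelling choice, the function names (`policy_value`, `optimal_value`, `is_feasible`), and the toy numbers are all hypothetical and do not come from the text above.

```python
def policy_value(P, r, policy, gamma, iters=2000):
    """Iteratively evaluate a deterministic policy on a tabular MDP.

    P[s][a] is a list of next-state probabilities, r[s][a] a reward.
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            r[s][policy[s]]
            + gamma * sum(p * v[t] for t, p in enumerate(P[s][policy[s]]))
            for s in range(n)
        ]
    return v


def optimal_value(P, r, gamma, iters=2000):
    """Plain value iteration for the optimal state values."""
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            max(
                r[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
    return v


def is_feasible(P, r, gamma, optimal_policy, sub_experts=(), tol=1e-6):
    """Check whether reward r is compatible with the observed experts.

    ASSUMPTION (not from the source): a sub-optimal expert is a pair
    (policy, xi) whose value must be within xi of optimal in every state.
    """
    v_star = optimal_value(P, r, gamma)
    # The optimal expert's policy must attain the optimal value everywhere.
    v_opt = policy_value(P, r, optimal_policy, gamma)
    if any(vs - vo > tol for vs, vo in zip(v_star, v_opt)):
        return False
    # Each sub-optimal expert must stay within its assumed value gap xi.
    for policy, xi in sub_experts:
        v_pi = policy_value(P, r, policy, gamma)
        if any(vs - vp > xi + tol for vs, vp in zip(v_star, v_pi)):
            return False
    return True


# Toy MDP: one state, two self-looping actions (hypothetical numbers).
P = [[[1.0], [1.0]]]
gamma = 0.9
optimal_policy = [0]            # optimal expert always plays action 0
sub_experts = [([1], 0.5)]      # sub-optimal expert plays action 1, xi = 0.5

r_far = [[1.0, 0.0]]            # action 1 is much worse than action 0
r_close = [[1.0, 0.96]]         # action 1 is almost as good as action 0

print(is_feasible(P, r_far, gamma, optimal_policy))
print(is_feasible(P, r_far, gamma, optimal_policy, sub_experts))
print(is_feasible(P, r_close, gamma, optimal_policy, sub_experts))
```

With only the optimal expert, `r_far` passes the check. Once the sub-optimal expert is added it is rejected, because under `r_far` that expert's value gap (10) exceeds its assumed bound (0.5). `r_close` remains compatible with both experts, since the gap there is about 0.4. This mirrors, in a very small setting, the claim that additional sub-optimal experts can shrink the feasible reward set.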
Taxonomy
Topics: Reinforcement Learning in Robotics · Sports Analytics and Performance · Auction Theory and Applications
Methods: Sparse Evolutionary Training
