A Ranking Game for Imitation Learning
Harshit Sikchi, Akanksha Saran, Wonjoon Goo, Scott Niekum

TL;DR
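This paper frames imitation learning as a two-player ranking game between a policy and a reward function, which lets expert demonstrations and preference data be combined. The authors report improved sample efficiency and solutions to Learning from Observation tasks that were previously unsolvable.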
Contribution
The paper presents a ranking-game framework and a ranking-loss algorithm that integrate demonstrations and preferences in imitation learning.
Findings
Achieves state-of-the-art sample efficiency.
Solves previously unsolvable tasks in Learning from Observation.
Demonstrates effectiveness of combined data modalities.
Abstract
We propose a new framework for imitation learning -- treating imitation as a two-player ranking-based game between a policy and a reward. In this game, the reward agent learns to satisfy pairwise performance rankings between behaviors, while the policy agent learns to maximize this reward. In imitation learning, near-optimal expert data can be difficult to obtain, and even in the limit of infinite data cannot imply a total ordering over trajectories as preferences can. On the other hand, learning from preferences alone is challenging as a large number of preferences are required to infer a high-dimensional reward function, though preference data is typically much easier to collect than expert demonstrations. The classical inverse reinforcement learning (IRL) formulation learns from expert demonstrations but provides no mechanism to incorporate learning from offline preferences and vice…
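The reward player's objective described in the abstract, satisfying pairwise performance rankings between behaviors, can be illustrated with a minimal sketch. This is not the paper's ranking loss: it swaps in a standard logistic (Bradley-Terry style) pairwise loss, and the names `trajectory_return`, `pairwise_ranking_loss`, and the toy scalar states are hypothetical choices made for illustration. The policy player's step (maximizing the learned reward) is not shown.

```python
import math

def trajectory_return(reward_fn, trajectory):
    """Sum of per-state rewards along a trajectory (here, a list of states)."""
    return sum(reward_fn(state) for state in trajectory)

def pairwise_ranking_loss(reward_fn, ranked_pairs):
    """Average logistic loss over (worse, better) trajectory pairs.

    Assumption: a Bradley-Terry style loss stands in for the paper's
    ranking loss. The loss is small when the reward assigns the
    'better' trajectory a higher return than the 'worse' one.
    """
    total = 0.0
    for worse, better in ranked_pairs:
        margin = (trajectory_return(reward_fn, better)
                  - trajectory_return(reward_fn, worse))
        # Numerically stable log(1 + exp(-margin)).
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(ranked_pairs)

# Toy data (hypothetical): states are scalars, larger values are "better".
pairs = [([0.0, 0.1], [1.0, 0.9])]

loss_agreeing = pairwise_ranking_loss(lambda s: 1.0 * s, pairs)    # respects the ranking
loss_violating = pairwise_ranking_loss(lambda s: -1.0 * s, pairs)  # inverts the ranking
loss_flat = pairwise_ranking_loss(lambda s: 0.0, pairs)            # ignores the ranking
```

A reward that respects the ranking receives a lower loss than one that inverts it, and a constant reward sits in between. In the game, the reward player would minimize such a loss over its parameters while the policy player optimizes against the resulting reward.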
Taxonomy
Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications
