A Ranking Game for Imitation Learning

Harshit Sikchi; Akanksha Saran; Wonjoon Goo; Scott Niekum

arXiv:2202.03481 · cs.LG · January 18, 2023

TL;DR

This paper frames imitation learning as a two-player ranking game between a policy and a reward, which lets expert demonstrations and offline preferences be used together. It reports improved sample efficiency and solves Learning from Observation tasks that were previously unsolvable.

Contribution

The paper presents a ranking-game framework and a ranking-loss algorithm that let demonstrations and preferences be used together for imitation learning.

Findings

1. Achieves state-of-the-art sample efficiency.
2. Solves previously unsolvable tasks in Learning from Observation.
3. Demonstrates effectiveness of combined data modalities.

Abstract

We propose a new framework for imitation learning -- treating imitation as a two-player ranking-based game between a policy and a reward. In this game, the reward agent learns to satisfy pairwise performance rankings between behaviors, while the policy agent learns to maximize this reward. In imitation learning, near-optimal expert data can be difficult to obtain, and even in the limit of infinite data cannot imply a total ordering over trajectories as preferences can. On the other hand, learning from preferences alone is challenging as a large number of preferences are required to infer a high-dimensional reward function, though preference data is typically much easier to collect than expert demonstrations. The classical inverse reinforcement learning (IRL) formulation learns from expert demonstrations but provides no mechanism to incorporate learning from offline preferences and vice…
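The two roles described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the reward is assumed to be linear in the state, the pairwise objective is a standard Bradley-Terry style logistic loss swapped in for illustration, and the "policy" step simply picks the candidate behavior with the highest learned return. All function names and the toy data are hypothetical.

```python
import numpy as np

def trajectory_return(w, traj):
    """Return of one trajectory under an assumed linear reward r(s) = w . s."""
    return float(np.sum(traj @ w))

def ranking_loss_and_grad(w, pairs):
    """Logistic (Bradley-Terry style) loss over (worse, better) trajectory pairs."""
    loss = 0.0
    grad = np.zeros_like(w)
    for worse, better in pairs:
        # A positive margin means the reward already ranks this pair correctly.
        margin = trajectory_return(w, better) - trajectory_return(w, worse)
        loss += np.logaddexp(0.0, -margin)            # log(1 + exp(-margin))
        sig = np.exp(-np.logaddexp(0.0, margin))      # sigmoid(-margin), computed stably
        grad += -sig * (better.sum(axis=0) - worse.sum(axis=0))
    n = len(pairs)
    return loss / n, grad / n

def fit_reward(pairs, dim, steps=200, lr=0.1):
    """Reward player: gradient descent on the pairwise ranking loss."""
    w = np.zeros(dim)
    for _ in range(steps):
        _, grad = ranking_loss_and_grad(w, pairs)
        w -= lr * grad
    return w

# Toy data (assumption): 2-D states, better behaviors have a larger first feature.
rng = np.random.default_rng(0)

def make_traj(mean, horizon=5):
    return rng.normal(loc=mean, scale=0.1, size=(horizon, 2))

pairs = [(make_traj([0.0, 0.0]), make_traj([1.0, 0.0])) for _ in range(20)]
initial_loss, _ = ranking_loss_and_grad(np.zeros(2), pairs)
w = fit_reward(pairs, dim=2)
final_loss, _ = ranking_loss_and_grad(w, pairs)

# Policy player (stand-in): choose the candidate behavior with the highest learned return.
candidates = [make_traj([0.2, 0.0]), make_traj([0.8, 0.0])]
best = max(range(len(candidates)), key=lambda i: trajectory_return(w, candidates[i]))
```

In a full system the policy step would be a reinforcement learning update against the learned reward, and the ranked pairs would come from policy rollouts versus expert data, from offline preferences, or from both.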

Taxonomy

Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications