Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment

Weichao Zhou; Wenchao Li

arXiv:2410.23680 · cs.LG · November 1, 2024

TL;DR

This paper introduces a semi-supervised IRL framework that emphasizes task alignment over data alignment, improving imitation learning performance in complex and transfer scenarios.

Contribution

It proposes a novel IRL-based imitation learning framework that prioritizes task objectives using weak supervision and adversarial training, addressing reward misalignment issues.

Findings

1. Outperforms traditional IL methods in complex tasks
2. Effective in transfer learning scenarios
3. Mitigates task-reward misalignment, supported by theoretical analysis

Abstract

Many imitation learning (IL) algorithms use inverse reinforcement learning (IRL) to infer a reward function that aligns with the demonstration. However, the inferred reward functions often fail to capture the underlying task objectives. In this paper, we propose a novel framework for IRL-based IL that prioritizes task alignment over conventional data alignment. Our framework is a semi-supervised approach that leverages expert demonstrations as weak supervision to derive a set of candidate reward functions that align with the task rather than only with the data. It then adopts an adversarial mechanism to train a policy with this set of reward functions to gain a collective validation of the policy's ability to accomplish the task. We provide theoretical insights into this framework's ability to mitigate task-reward misalignment and present a practical implementation. Our experimental…
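
The adversarial step described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a small finite set of candidate policies and candidate reward functions, summarized in a hypothetical utility matrix `U` where `U[i, j]` is the expected return of policy `i` under candidate reward `j`, and it reads "adversarial" as choosing the policy whose worst-case regret over the candidate set is smallest. The names `worst_case_regret`, `select_policy`, and `U`, as well as the numbers, are illustrative assumptions.

```python
import numpy as np


def worst_case_regret(U: np.ndarray) -> np.ndarray:
    """For each policy (row), return its largest regret over all candidate rewards (columns)."""
    # Best return any candidate policy achieves under each candidate reward.
    best_per_reward = U.max(axis=0)
    # regret[i, j] = how far policy i falls short of the best policy under reward j.
    regret = best_per_reward - U
    # The adversary picks, for each policy, the reward that exposes the largest shortfall.
    return regret.max(axis=1)


def select_policy(U: np.ndarray) -> int:
    """Pick the index of the policy with the smallest worst-case regret."""
    return int(np.argmin(worst_case_regret(U)))


# Hypothetical returns: 3 candidate policies (rows) x 3 candidate rewards (columns).
U = np.array([
    [10.0, 1.0, 2.0],  # excels under reward 0 only
    [6.0, 6.0, 5.0],   # reasonable under every candidate reward
    [2.0, 9.0, 1.0],   # excels under reward 1 only
])

print(worst_case_regret(U))  # [8. 4. 8.]
print(select_policy(U))      # 1
```

In this toy setting the policy that does acceptably under every candidate reward is preferred over policies that excel under a single reward, which is the sense in which a set of rewards can give a "collective validation" of a policy.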

Code & Models

Repositories

zwc662/PAGAR (PyTorch, official)

Videos

Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: ALIGN · Sparse Evolutionary Training