Policy Contrastive Imitation Learning

Jialei Huang; Zhaoheng Yin; Yingdong Hu; Yang Gao

arXiv:2307.02829·cs.LG·July 7, 2023

TL;DR

This paper introduces Policy Contrastive Imitation Learning (PCIL), a novel approach that improves imitation learning by learning a contrastive representation space, leading to more meaningful rewards and state-of-the-art performance on challenging tasks.

Contribution

PCIL is a contrastive representation learning method for imitation learning. It targets the quality of the discriminator's representation so that the resulting rewards are more meaningful, and the paper supports it with theoretical analysis and empirical results.

Findings

01. Achieves state-of-the-art performance on the DeepMind Control Suite
02. Builds a smoother, more meaningful representation space
03. Outperforms existing adversarial imitation learning methods

Abstract

Adversarial imitation learning (AIL) is a popular method that has recently achieved much success. However, the performance of AIL is still unsatisfactory on more challenging tasks. We find that one major reason is the low quality of the AIL discriminator's representation. Because the AIL discriminator is trained via binary classification, which does not necessarily discriminate the policy from the expert in a meaningful way, the resulting reward might not be meaningful either. We propose a new method called Policy Contrastive Imitation Learning (PCIL) to resolve this issue. PCIL learns a contrastive representation space by anchoring on different policies and generates a smooth cosine-similarity-based reward. Our proposed representation learning objective can be viewed as a stronger version of the AIL objective and provides a more meaningful comparison between the agent and the…
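
To make the idea of a cosine-similarity-based reward over a contrastive representation space more concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the abstract does not state the exact objective, so the loss below is a generic InfoNCE-style stand-in, and the function names (`cosine_reward`, `contrastive_loss`), the temperature value, and the toy embeddings are all assumptions made for illustration. In practice the embeddings would come from a learned encoder, which is omitted here.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products equal cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def cosine_reward(agent_emb, expert_emb):
    """Hypothetical reward: mean cosine similarity of each agent embedding
    to the set of expert embeddings. Values lie in [-1, 1]."""
    a = l2_normalize(agent_emb)
    e = l2_normalize(expert_emb)
    return (a @ e.T).mean(axis=1)


def contrastive_loss(expert_emb, agent_emb, temperature=0.1):
    """InfoNCE-style stand-in (an assumption, not the paper's stated loss).
    Each expert sample is an anchor; other expert samples are positives,
    agent samples are negatives. Requires at least two expert samples."""
    e = l2_normalize(expert_emb)
    a = l2_normalize(agent_emb)
    pos = np.exp(e @ e.T / temperature)       # expert-expert similarities
    neg = np.exp(e @ a.T / temperature)       # expert-agent similarities
    mask = 1.0 - np.eye(e.shape[0])           # drop each anchor's self-pair
    pos_sum = (pos * mask).sum(axis=1)
    neg_sum = neg.sum(axis=1)
    return float(-np.log(pos_sum / (pos_sum + neg_sum)).mean())


# Toy embeddings (illustrative only): experts cluster around one direction,
# "near" agent samples share it, "far" agent samples point the opposite way.
rng = np.random.default_rng(0)
direction = np.array([1.0, 0.0, 0.0, 0.0])
expert = direction + 0.05 * rng.normal(size=(8, 4))
near = direction + 0.05 * rng.normal(size=(8, 4))
far = -direction + 0.05 * rng.normal(size=(8, 4))

r_near = cosine_reward(near, expert)
r_far = cosine_reward(far, expert)
loss_overlapping = contrastive_loss(expert, near)
loss_separated = contrastive_loss(expert, far)
```

In this toy setup, agent samples that embed close to the expert cluster receive a higher reward than those that embed far from it, and the reward changes smoothly with the angle between embeddings rather than saturating like a classifier output. The stand-in loss is smaller when agent and expert embeddings are well separated, which is the property a contrastive encoder would be trained toward.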

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications