Rethinking ValueDice: Does It Really Improve Performance?

Ziniu Li; Tian Xu; Yang Yu; Zhi-Quan Luo

arXiv:2202.02468 · cs.LG · May 30, 2022 · 1 cites

TL;DR

This paper critically examines ValueDice, arguing that its offline advantage over Behavioral Cloning (BC) is explained by overfitting in the low-data regime and by regularization rather than by a more advanced algorithm design, and that ValueDice fails when expert trajectories are subsampled.

Contribution

The paper provides a detailed analysis showing that ValueDice's offline improvements are largely explained by regularization effects and that ValueDice can reduce to Behavioral Cloning in the offline setting.

Findings

01

ValueDice can reduce to Behavioral Cloning in the offline setting.

02

Regularization, such as weight decay, improves offline imitation performance.

03

ValueDice fails with subsampled expert trajectories, limiting its practical effectiveness.
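
The role of weight decay in Behavioral Cloning can be illustrated with a minimal sketch. It assumes a linear policy, a squared-error loss, and synthetic toy data, none of which come from the paper; the function name `fit_bc_linear` and all hyperparameter values are hypothetical. It only shows the mechanism: BC is supervised learning on expert state-action pairs, and weight decay adds an L2 penalty that shrinks the learned weights.

```python
import numpy as np

def fit_bc_linear(states, actions, weight_decay=0.0, lr=0.05, steps=2000):
    """Fit a linear policy a = s @ W by gradient descent.

    Objective (an assumption for illustration, not the paper's setup):
        0.5 * mean ||s @ W - a||^2 + 0.5 * weight_decay * ||W||^2
    """
    n, state_dim = states.shape
    action_dim = actions.shape[1]
    W = np.zeros((state_dim, action_dim))
    for _ in range(steps):
        residual = states @ W - actions                   # prediction error on expert pairs
        grad = states.T @ residual / n + weight_decay * W  # data term + L2 penalty term
        W -= lr * grad
    return W

# Hypothetical toy "expert" dataset: few samples, noisy actions.
rng = np.random.default_rng(0)
true_W = rng.normal(size=(4, 2))
states = rng.normal(size=(8, 4))
actions = states @ true_W + 0.1 * rng.normal(size=(8, 2))

W_plain = fit_bc_linear(states, actions, weight_decay=0.0)
W_decay = fit_bc_linear(states, actions, weight_decay=0.1)

# Weight decay shrinks the fitted weights relative to the unregularized fit.
print(np.linalg.norm(W_plain), np.linalg.norm(W_decay))
```

Whether the shrinkage helps held-out performance depends on the data and the penalty strength; this toy example does not reproduce the paper's experiments.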

Abstract

Since the introduction of GAIL, adversarial imitation learning (AIL) methods have attracted considerable research interest. Among these methods, ValueDice has achieved significant improvements: it beats the classical approach Behavioral Cloning (BC) in the offline setting, and it requires fewer interactions than GAIL in the online setting. Do these improvements come from more advanced algorithm designs? We answer this question with the following conclusions. First, we show that ValueDice could reduce to BC in the offline setting. Second, we verify that overfitting exists and regularization matters in the low-data regime. Specifically, we demonstrate that with weight decay, BC also nearly matches the expert performance, as ValueDice does. The first two claims explain the superior offline performance of ValueDice. Third, we establish that ValueDice does not work when the expert…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)

Methods: Generative Adversarial Imitation Learning