Robust Action Gap Increasing with Clipped Advantage Learning

Zhe Zhang; Yaozhong Gan; Xiaoyang Tan

arXiv:2203.11677·cs.LG·March 23, 2022

TL;DR

This paper introduces clipped Advantage Learning, a method that adjusts the advantage adaptively so that the action gap is enlarged only where it is needed, keeping robustness to estimation errors without slowing value convergence. The method is evaluated on several reinforcement learning benchmarks.

Contribution

The paper proposes a novel clipped Advantage Learning method that adaptively adjusts the advantage to balance action gap size and convergence speed.

Findings

01 Fast convergence guarantee demonstrated

02 Retains proper action gaps for robustness

03 Effective on multiple RL benchmarks

Abstract

Advantage Learning (AL) seeks to increase the action gap between the optimal action and its competitors, so as to improve the robustness to estimation errors. However, the method becomes problematic when the optimal action induced by the approximated value function does not agree with the true optimal action. In this paper, we present a novel method, named clipped Advantage Learning (clipped AL), to address this issue. The method is inspired by our observation that increasing the action gap blindly for all given samples, without considering whether each sample needs it, could accumulate more errors in the performance loss bound and lead to slow value convergence. To avoid that, we should adjust the advantage value adaptively. We show that our simple clipped AL operator not only enjoys a fast convergence guarantee but also retains proper action gaps, hence achieving a good balance…
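The contrast between AL and a clipped variant can be sketched on a tabular problem. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation. It uses the commonly used form of the AL operator, which subtracts alpha * (V(s) - Q(s, a)) from the Bellman optimality backup. The clipping rule is an assumption made for illustration: the gap term is applied only to actions whose value, measured from an assumed lower bound `q_lower`, is at least a fraction `c` of the greedy value. The names `q_lower`, `c`, and the one-state MDP are hypothetical.

```python
import numpy as np

def bellman_backup(Q, P, R, gamma):
    """Bellman optimality backup: R(s,a) + gamma * E_{s'}[max_a' Q(s',a')]."""
    V = Q.max(axis=1)              # greedy state values, shape (S,)
    return R + gamma * (P @ V)     # P has shape (S, A, S), result is (S, A)

def al_backup(Q, P, R, gamma, alpha):
    """Advantage Learning: subtract alpha * (V(s) - Q(s,a)) for every action."""
    V = Q.max(axis=1, keepdims=True)
    return bellman_backup(Q, P, R, gamma) - alpha * (V - Q)

def clipped_al_backup(Q, P, R, gamma, alpha, c, q_lower):
    """Assumed clipping rule: apply the gap term only to close competitors.

    An action counts as close when (Q - q_lower) / (V - q_lower) >= c.
    """
    V = Q.max(axis=1, keepdims=True)
    ratio = (Q - q_lower) / np.maximum(V - q_lower, 1e-12)
    mask = ratio >= c              # True where the gap term is kept
    return bellman_backup(Q, P, R, gamma) - alpha * mask * (V - Q)

def action_gap(Q):
    """Difference between the best and second-best action value per state."""
    top2 = np.sort(Q, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

# Hypothetical one-state, three-action MDP in which every action loops back.
P = np.ones((1, 3, 1))
R = np.array([[1.0, 0.8, 0.0]])
gamma, alpha = 0.9, 0.5
Q = np.array([[10.0, 9.8, 9.0]])   # fixed point of the plain Bellman backup

q_bellman = bellman_backup(Q, P, R, gamma)
q_al = al_backup(Q, P, R, gamma, alpha)
q_clip = clipped_al_backup(Q, P, R, gamma, alpha, c=0.95, q_lower=0.0)

print("Bellman   :", q_bellman, "gap", action_gap(q_bellman))
print("AL        :", q_al, "gap", action_gap(q_al))
print("clipped AL:", q_clip, "gap", action_gap(q_clip))
```

In this toy setting both AL variants widen the gap to the nearest competitor, while the clipped variant leaves the already distant third action unchanged. This matches the abstract's point that the gap should not be increased blindly for every sample.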

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Machine Learning and Algorithms