ClipUp: A Simple and Powerful Optimizer for Distribution-based Policy Evolution

Nihat Engin Toklu; Paweł Liskowski; Rupesh Kumar Srivastava

arXiv:2008.02387·cs.NE·December 9, 2020

TL;DR

This paper introduces ClipUp, a simple optimizer for distribution-based policy evolution in reinforcement learning. The authors argue that it is easier to tune than Adam and more robust to changes in reward scale.

Contribution

The paper proposes ClipUp, a momentum-based optimizer with gradient normalization and update clipping, tailored for distribution-based policy evolution, simplifying hyperparameter tuning and improving robustness.
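
The three ingredients named above (momentum, gradient normalization, update clipping) can be sketched as a single update step. This is a minimal illustration and not the authors' reference implementation: the function name `clipup_step`, the parameter names `step_size`, `max_speed`, and `momentum`, and their default values are assumptions chosen for the example.

```python
import math


def clipup_step(params, velocity, grad,
                step_size=0.15, max_speed=0.3, momentum=0.9):
    """One ClipUp-style ascent step on plain Python lists (illustrative sketch)."""
    grad_norm = math.sqrt(sum(g * g for g in grad))
    if grad_norm > 0.0:
        # Gradient normalization: keep only the direction of the estimate.
        unit = [g / grad_norm for g in grad]
    else:
        unit = [0.0 for _ in grad]

    # Classical momentum applied to the normalized gradient.
    new_velocity = [momentum * v + step_size * u
                    for v, u in zip(velocity, unit)]

    # Update clipping: cap the norm of the velocity at max_speed.
    speed = math.sqrt(sum(v * v for v in new_velocity))
    if speed > max_speed:
        new_velocity = [v * (max_speed / speed) for v in new_velocity]

    # Gradient ascent: move the parameters along the clipped velocity.
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity
```

Because only the direction of the gradient enters the update, multiplying every reward (and therefore the gradient estimate) by a constant leaves the step unchanged, which is one way to read the robustness claim.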

Findings

01. ClipUp performs competitively with Adam on reinforcement learning tasks.

02. It simplifies hyperparameter tuning and adapts well to changes in reward scale.

03. It is effective on challenging continuous control benchmarks, including Humanoid.

Abstract

Distribution-based search algorithms are an effective approach for evolutionary reinforcement learning of neural network controllers. In these algorithms, gradients of the total reward with respect to the policy parameters are estimated using a population of solutions drawn from a search distribution, and then used for policy optimization with stochastic gradient ascent. A common choice in the community is to use the Adam optimization algorithm for obtaining an adaptive behavior during gradient ascent, due to its success in a variety of supervised learning settings. As an alternative to Adam, we propose to enhance classical momentum-based gradient ascent with two simple techniques: gradient normalization and update clipping. We argue that the resulting optimizer called ClipUp (short for "clipped updates") is a better choice for distribution-based policy evolution because its working…
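
To make the phrase "gradients ... estimated using a population of solutions drawn from a search distribution" concrete, here is a small sketch of one common estimator: symmetric (antithetic) sampling around the mean of an isotropic Gaussian, in the style of PGPE. The abstract does not specify the estimator, so the choice of estimator, the function names `estimate_gradient` and `toy_reward`, and all default values are assumptions for illustration. The result is a search direction up to scale, which is sufficient for an optimizer that normalizes its gradient.

```python
import random


def estimate_gradient(reward_fn, mean, sigma=0.1, num_pairs=200, seed=0):
    """Estimate an ascent direction for the distribution mean (illustrative sketch)."""
    rng = random.Random(seed)
    dim = len(mean)
    grad = [0.0] * dim
    for _ in range(num_pairs):
        # Draw one perturbation and evaluate the two mirrored solutions.
        eps = [rng.gauss(0.0, sigma) for _ in range(dim)]
        r_plus = reward_fn([m + e for m, e in zip(mean, eps)])
        r_minus = reward_fn([m - e for m, e in zip(mean, eps)])
        # Weight the perturbation by how much better the "+" side scored.
        weight = (r_plus - r_minus) / 2.0
        for j in range(dim):
            grad[j] += weight * eps[j] / num_pairs
    return grad


def toy_reward(x):
    """Hypothetical stand-in for an episode's total reward; maximized at the origin."""
    return -sum(v * v for v in x)
```

For example, `estimate_gradient(toy_reward, [1.0, 1.0])` returns a vector whose components are negative, pointing from the current mean back toward the maximum at the origin.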

Code & Models

Repositories

nnaisense/pgpelib (PyTorch, official)

Taxonomy

Methods: Adam