Policy-based optimization: single-step policy gradient method seen as an evolution strategy

Jonathan Viquerat; Régis Duvigneau; Philippe Meliga; Alexander Kuhnle; Elie Hachem

arXiv:2104.06175 · math.OC · November 29, 2021 · Neural Comput. Appl.

TL;DR

This paper introduces Policy-Based Optimization (PBO), a black-box optimization method that draws on policy gradient and evolution strategy techniques, and evaluates it on standard analytic functions and on the optimization of parametric control laws.

Contribution

It presents the PBO algorithm as a novel, versatile black-box optimization technique with detailed description and initial validation, bridging policy gradient and evolution strategies.

Findings

01  PBO performs well on standard analytic functions.

02  PBO is effective for optimizing control laws for the Lorenz attractor.

03  The method shows promise as a flexible optimization tool.

Abstract

This research reports on the recent development of a black-box optimization method based on single-step deep reinforcement learning (DRL), and on its conceptual proximity to evolution strategy (ES) techniques. In the fashion of policy gradient (PG) methods, the policy-based optimization (PBO) algorithm relies on the update of a policy network to describe the density function of its next generation of individuals. The method is described in detail, and its similarities to both ES and PG methods are pointed out. The relevance of the approach is then evaluated on the minimization of standard analytic functions, with comparison to classic ES techniques (ES, CMA-ES). It is then applied to the optimization of parametric control laws designed for the Lorenz attractor. Given the scarce existing literature on the method, this contribution definitely establishes the PBO method as a valid,…
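
To make the idea in the abstract concrete, here is a minimal sketch of a single-step policy gradient loop that behaves like an evolution strategy. It is not the authors' PBO implementation. Several simplifications are assumptions made for illustration: the "policy" is a diagonal Gaussian whose mean and log standard deviation are updated directly (the paper uses a policy network to produce the density), the score-function gradient is rescaled by the inverse Fisher information as in natural evolution strategies (swapped in here to keep plain gradient ascent stable), and the function names, population size, and learning rates are hypothetical.

```python
import numpy as np


def sphere(x):
    """Test objective f(x) = sum(x_i^2); its minimum is 0 at the origin."""
    return np.sum(x ** 2, axis=-1)


def single_step_pg_minimize(f, dim, n_gen=300, pop=16,
                            lr_mu=0.5, lr_log_sigma=0.1, seed=0):
    """Minimize f with a Gaussian search density updated by policy gradient.

    Assumption: mean and log-std are free parameters, not network outputs.
    """
    rng = np.random.default_rng(seed)
    mu = rng.uniform(-2.0, 2.0, size=dim)   # initial mean of the density
    log_sigma = np.zeros(dim)               # initial std of 1 per dimension

    for _ in range(n_gen):
        sigma = np.exp(log_sigma)

        # One generation: each individual is a single-step "episode".
        eps = rng.standard_normal((pop, dim))
        x = mu + sigma * eps

        # Reward is the negative cost; normalize it within the generation
        # so that it acts as an advantage estimate.
        reward = -f(x)
        adv = (reward - reward.mean()) / (reward.std() + 1e-8)

        # Score function of N(mu, sigma^2):
        #   d log p / d mu        = eps / sigma
        #   d log p / d log_sigma = eps^2 - 1
        # Multiplying by the inverse Fisher information (sigma^2 and 1/2)
        # gives the natural-gradient directions used below.
        nat_grad_mu = (adv[:, None] * sigma * eps).mean(axis=0)
        nat_grad_log_sigma = (adv[:, None] * (eps ** 2 - 1.0) / 2.0).mean(axis=0)

        # Gradient ascent on the expected advantage.
        mu = mu + lr_mu * nat_grad_mu
        log_sigma = log_sigma + lr_log_sigma * nat_grad_log_sigma

    return mu, np.exp(log_sigma)


if __name__ == "__main__":
    mu, sigma = single_step_pg_minimize(sphere, dim=2)
    print("final mean:", mu)
    print("final std:", sigma)
    print("cost at mean:", sphere(mu))
```

Each generation samples individuals from the current density, scores them, and shifts the density toward the better ones. Read as reinforcement learning, this is a policy gradient step on one-step episodes; read as optimization, it is an evolution strategy with an adaptive sampling distribution.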

Code & Models

Repositories

jviquerat/pbo (official)
