TL;DR
This paper introduces Policy-Based Optimization (PBO), a black-box optimization method inspired by policy gradient and evolution strategies, demonstrating its effectiveness on standard functions and control law optimization.
Contribution
It presents the PBO algorithm as a novel, versatile black-box optimization technique with detailed description and initial validation, bridging policy gradient and evolution strategies.
Findings
PBO performs well on standard analytic functions.
PBO is effective for optimizing control laws for the Lorenz attractor.
The method shows promise as a flexible optimization tool.
Abstract
This research reports on the recent development of a black-box optimization method based on single-step deep reinforcement learning (DRL), and on its conceptual proximity to evolution strategy (ES) techniques. In the fashion of policy gradient (PG) methods, the policy-based optimization (PBO) algorithm relies on the update of a policy network to describe the density function of its next generation of individuals. The method is described in detail, and its similarities to both ES and PG methods are pointed out. The relevance of the approach is then evaluated on the minimization of standard analytic functions, with comparison to classic ES techniques (ES, CMA-ES). It is then applied to the optimization of parametric control laws designed for the Lorenz attractor. Given the scarce existing literature on the method, this contribution definitely establishes the PBO method as a valid,…
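The idea of updating a sampling density with a policy-gradient-style rule, one generation at a time, can be illustrated with a deliberately simplified sketch. This is not the paper's implementation. It assumes the policy network is replaced by directly learned parameters (the mean and log standard deviation of a diagonal Gaussian), that the reward is the negative cost normalized within each generation, and that the mean update is scaled by the standard deviation, as in natural evolution strategies, for numerical stability. The names `sphere` and `gaussian_policy_search`, and all hyperparameter values, are hypothetical.

```python
import numpy as np


def sphere(x):
    # Standard analytic test function; its minimum is 0 at the origin.
    return np.sum(np.asarray(x) ** 2, axis=-1)


def gaussian_policy_search(f, x0, sigma0=1.0, pop_size=16, n_gen=200,
                           lr_mu=0.5, lr_log_sigma=0.1, seed=0):
    """Minimize f by adapting a diagonal Gaussian sampling density.

    Assumption: mu and log_sigma stand in for the output of a policy
    network; a real single-step DRL method would train network weights.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(x0, dtype=float).copy()
    log_sigma = np.full_like(mu, np.log(sigma0))

    for _ in range(n_gen):
        sigma = np.exp(log_sigma)
        # Sample one generation of individuals from the current density.
        eps = rng.standard_normal((pop_size, mu.size))
        x = mu + sigma * eps
        cost = f(x)

        # Advantage: negative cost, normalized within the generation,
        # so lower-cost individuals get positive weight.
        adv = -(cost - cost.mean()) / (cost.std() + 1e-8)

        # Score-function (REINFORCE-style) gradient estimates.
        # d log N / d mu = eps / sigma; multiplying by sigma**2 (a
        # natural-gradient-style scaling) gives a step of sigma * eps.
        step_mu = (adv[:, None] * sigma * eps).mean(axis=0)
        # d log N / d log_sigma = eps**2 - 1.
        step_log_sigma = (adv[:, None] * (eps ** 2 - 1.0)).mean(axis=0)

        mu += lr_mu * step_mu
        log_sigma += lr_log_sigma * step_log_sigma

    return mu, np.exp(log_sigma)


if __name__ == "__main__":
    mu, sigma = gaussian_policy_search(sphere, [2.0, -1.5])
    print("final mean:", mu, "cost:", sphere(mu), "sigma:", sigma)
```

Each loop iteration plays the role of one generation: individuals are drawn from the current density, evaluated once (a single-step episode), and used to shift the density toward lower cost. The sketch also shows why the method sits close to ES techniques, since the same loop can be read as a simple evolution strategy with an adaptive step size.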
