Policy Search in Continuous Action Domains: an Overview

Olivier Sigaud; Freek Stulp

arXiv:1803.04706·cs.LG·June 14, 2019

TL;DR

This paper provides a comprehensive overview of policy search methods in continuous action domains, highlighting their relationships, sample efficiency, and recent advances driven by deep reinforcement learning and evolutionary algorithms.

Contribution

It offers a unified perspective on diverse policy search approaches, including Bayesian Optimization and directed exploration, emphasizing how the families of methods relate to one another and which factors underlie their sample efficiency.

Findings

1. Survey of various policy search methods
2. Analysis of relationships between different approaches
3. Discussion on factors affecting sample efficiency

Abstract

Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and by the emergence of competitors based on evolutionary algorithms. In this paper, we present a broad survey of policy search methods, providing a unified perspective on very different approaches, including Bayesian Optimization and directed exploration methods. The main message of this overview concerns the relationships between the families of methods, but we also outline some factors underlying the sample efficiency properties of the various approaches.
