Policy Search in Continuous Action Domains: an Overview
Olivier Sigaud, Freek Stulp

TL;DR
This paper surveys policy search methods in continuous action domains, highlighting the relationships between families of methods, the factors underlying their sample efficiency, and the recent research driven by deep reinforcement learning and competing evolutionary algorithms.
Contribution
It offers a unified perspective on diverse policy search approaches, including Bayesian Optimization and directed exploration, emphasizing their interrelations and efficiency factors.
Findings
Survey of various policy search methods
Analysis of relationships between different approaches
Discussion of factors affecting sample efficiency
Abstract
Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and by the emergence of competitors based on evolutionary algorithms. In this paper, we present a broad survey of policy search methods, providing a unified perspective on very different approaches, including Bayesian Optimization and directed exploration methods. The main message of this overview concerns the relationships between the families of methods, but we also outline some factors underlying the sample efficiency properties of the various approaches.
