Evolutionary Strategy Guided Reinforcement Learning via MultiBuffer Communication

Adam Callaghan; Karl Mason; Patrick Mannion

arXiv:2306.11535·cs.NE·June 21, 2023·2 cites


TL;DR

This paper introduces a novel Evolutionary Reinforcement Learning framework that combines Evolutionary Strategies with TD3 using a multi-buffer system, enhancing policy search and performance on control tasks.

Contribution

It presents a new multi-buffer approach that improves policy exploration and performance in Evolutionary Reinforcement Learning by integrating Evolutionary Strategies with TD3.

Findings

1. Outperforms CEM-RL on 3 of 4 MuJoCo tasks
2. Enables freer policy search without buffer overpopulation issues
3. Demonstrates competitive results with state-of-the-art algorithms

Abstract

Evolutionary Algorithms and Deep Reinforcement Learning have both successfully solved control problems across a variety of domains. Recently, algorithms have been proposed that combine these two methods, aiming to leverage the strengths and mitigate the weaknesses of both approaches. In this paper, we introduce a new Evolutionary Reinforcement Learning model which combines a particular family of Evolutionary Algorithms called Evolutionary Strategies with the off-policy Deep Reinforcement Learning algorithm TD3. The framework utilises a multi-buffer system instead of a single shared replay buffer. The multi-buffer system allows the Evolutionary Strategy to search freely in the search space of policies, without running the risk of overpopulating the replay buffer with poorly performing trajectories, which limit the number of desirable policy behaviour examples, thus negatively…
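The multi-buffer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `MultiBuffer`, the `source` labels, and the `es_fraction` mixing ratio are all hypothetical, and the abstract as shown here does not say how the buffers are sampled. The sketch only shows the general principle that transitions gathered by the Evolutionary Strategy population are kept apart from those gathered by the RL actor, so one source cannot crowd the other out of storage.

```python
import random
from collections import deque


class MultiBuffer:
    """Hypothetical two-buffer replay store (illustrative, not from the paper)."""

    def __init__(self, capacity=10_000, seed=0):
        # Each source gets its own bounded FIFO buffer, so a flood of
        # ES transitions can never evict RL transitions (and vice versa).
        self.rl_buffer = deque(maxlen=capacity)
        self.es_buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition, source):
        """Store a transition in the buffer that matches its source."""
        if source == "rl":
            self.rl_buffer.append(transition)
        elif source == "es":
            self.es_buffer.append(transition)
        else:
            raise ValueError(f"unknown source: {source!r}")

    def sample(self, batch_size, es_fraction=0.25):
        """Draw a mixed batch; es_fraction is an assumed mixing ratio."""
        n_es = min(int(batch_size * es_fraction), len(self.es_buffer))
        n_rl = min(batch_size - n_es, len(self.rl_buffer))
        batch = self._rng.sample(list(self.es_buffer), n_es)
        batch += self._rng.sample(list(self.rl_buffer), n_rl)
        return batch


# Usage: transitions are plain tuples here; a real agent would store
# (state, action, reward, next_state, done) arrays instead.
buffers = MultiBuffer(capacity=1_000)
for i in range(100):
    buffers.add(("es", i), source="es")
    buffers.add(("rl", i), source="rl")
batch = buffers.sample(batch_size=8, es_fraction=0.25)
```

Keeping the split at storage time means the share of Evolutionary Strategy data that reaches the learner is set by an explicit parameter, not by how many rollouts the population happens to produce.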


Taxonomy

Topics: Smart Grid Energy Management · Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics

Methods: Target Policy Smoothing · Adam · Experience Replay · Dense Connections · Clipped Double Q-learning · Twin Delayed Deep Deterministic