EvoRL: A GPU-accelerated Framework for Evolutionary Reinforcement Learning
Bowen Zheng, Ran Cheng, Kay Chen Tan

TL;DR
EvoRL is a GPU-accelerated framework that speeds up evolutionary reinforcement learning by running the entire training pipeline, including environment simulation and evolutionary computation, on accelerators, which makes large-population training practical on a single machine.
Contribution
The paper introduces what the authors describe as the first end-to-end EvoRL framework optimized for GPUs. It applies hierarchical parallelism through vectorization and compilation, and supports a range of algorithms and hybrid paradigms.
Findings
Achieves superior speed and scalability for large populations.
Supports a wide range of RL and evolutionary algorithms.
Facilitates fair benchmarking and ablation studies.
Abstract
Evolutionary Reinforcement Learning (EvoRL) has emerged as a promising approach to overcoming the limitations of traditional reinforcement learning (RL) by integrating the Evolutionary Computation (EC) paradigm with RL. However, the population-based nature of EC significantly increases computational costs, thereby restricting the exploration of algorithmic design choices and scalability in large-scale settings. To address this challenge, we introduce EvoRL, the first end-to-end EvoRL framework optimized for GPU acceleration. The framework executes the entire training pipeline on accelerators, including environment simulations and EC processes, leveraging hierarchical parallelism through vectorization and compilation techniques to achieve superior speed and scalability. This design enables the efficient training of large populations on a single machine. In addition…
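The idea of hierarchical parallelism through vectorization and compilation can be illustrated with a minimal sketch. It assumes JAX as the array library, which the text above does not specify, and every name in it (`rollout`, `eval_population`, `generation`, the toy point environment, the population size) is hypothetical and is not EvoRL's API. Episodes are vectorized inside each individual, individuals are vectorized across the population, and one whole generation is compiled as a single function so that evaluation and selection stay on the accelerator.

```python
import jax
import jax.numpy as jnp

HORIZON = 20  # steps per episode in the toy environment (hypothetical value)

def rollout(params, x0):
    """Return of one episode: a linear policy pushes a 1-D point toward 0."""
    def step(x, _):
        action = jnp.tanh(params[0] * x + params[1])
        x_next = x + 0.1 * action
        reward = -x_next ** 2  # closer to the origin is better
        return x_next, reward
    _, rewards = jax.lax.scan(step, x0, None, length=HORIZON)
    return rewards.sum()

# Inner level: vectorize over episodes (start states) for one individual.
eval_individual = jax.vmap(rollout, in_axes=(None, 0))
# Outer level: vectorize over the population of parameter vectors.
eval_population = jax.vmap(eval_individual, in_axes=(0, None))

@jax.jit
def generation(key, population, starts):
    """One generation: evaluate, keep the top half, refill with mutated copies."""
    fitness = eval_population(population, starts).mean(axis=1)
    n_elite = population.shape[0] // 2  # static under jit
    elite_idx = jnp.argsort(-fitness)[:n_elite]
    elites = population[elite_idx]
    noise = 0.1 * jax.random.normal(key, elites.shape)
    # Returned fitness belongs to the population that was passed in.
    return jnp.concatenate([elites, elites + noise]), fitness

key = jax.random.PRNGKey(0)
key, k_pop, k_start = jax.random.split(key, 3)
population = jax.random.normal(k_pop, (64, 2))  # 64 individuals, 2 params each
starts = jax.random.uniform(k_start, (8,), minval=-1.0, maxval=1.0)

best_per_generation = []
for _ in range(10):
    key, subkey = jax.random.split(key)
    population, fitness = generation(subkey, population, starts)
    best_per_generation.append(float(fitness.max()))
```

Because the elites are carried over unchanged and the start states are fixed, the best fitness in this sketch should not get worse from one generation to the next. Truncation selection with Gaussian mutation is used here only because it is short; it stands in for whichever EC operator a real framework would apply.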
Taxonomy
Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics
Methods: Experience Replay · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Weight Decay · Target Policy Smoothing · Batch Normalization · Dense Connections · Adam · Entropy Regularization · Convolution
