Robust Quadruped Locomotion via Evolutionary Reinforcement Learning
Brian McAteer, Karl Mason

TL;DR
This paper demonstrates that combining evolutionary search with reinforcement learning makes quadruped locomotion policies more robust, especially on unseen rough terrain, where they outperform standard deep RL methods.
Contribution
The paper compares evolutionary reinforcement learning methods with standard deep RL and shows that the evolutionary variants transfer better to rough terrain.
Findings
CEM-TD3 achieves the highest reward during training and testing.
Evolutionary variants retain capabilities on rough terrain better than standard RL.
Incorporating evolutionary search reduces overfitting and improves robustness.
Abstract
Deep reinforcement learning has recently achieved strong results in quadrupedal locomotion, yet policies trained in simulation often fail to transfer when the environment changes. Evolutionary reinforcement learning aims to address this limitation by combining gradient-based policy optimisation with population-driven exploration. This work evaluates four methods on a simulated walking task: DDPG, TD3, and two Cross-Entropy-based variants, CEM-DDPG and CEM-TD3. All agents are trained on flat terrain and later tested both on this domain and on a rough terrain not encountered during training. TD3 performs best among the standard deep RL baselines on flat ground, with a mean reward of 5927.26, while CEM-TD3 achieves the highest reward overall during training and evaluation (17611.41). Under the rough-terrain transfer test, the performance of the deep RL methods drops sharply. DDPG achieves…
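To make "population-driven exploration" concrete, the following is a minimal sketch of a generic cross-entropy search loop over a parameter vector. It is not the authors' implementation: the function names (cem_search, toy_fitness), the hyperparameters, and the quadratic toy objective are all assumptions made for illustration. In CEM-DDPG and CEM-TD3 the fitness would be the episode return of a policy network, and part of the population would also receive gradient updates from a critic; that part is omitted here.

```python
import numpy as np


def cem_search(fitness, dim, pop_size=20, elite_frac=0.5,
               iterations=50, init_std=1.0, noise=1e-3, seed=0):
    """Generic cross-entropy search (illustrative, hypothetical defaults).

    Keeps a diagonal Gaussian over parameter vectors, samples a population,
    and refits the Gaussian to the best-scoring samples each iteration.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    var = np.full(dim, init_std ** 2)
    n_elite = max(1, int(pop_size * elite_frac))

    for _ in range(iterations):
        # Sample candidate parameter vectors around the current mean.
        population = mean + np.sqrt(var) * rng.standard_normal((pop_size, dim))
        # Score every candidate (higher is better).
        scores = np.array([fitness(p) for p in population])
        # Keep the top-scoring candidates and refit the distribution to them.
        elite = population[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        # A small variance floor keeps the search from collapsing too early.
        var = elite.var(axis=0) + noise

    return mean


def toy_fitness(params):
    # Stand-in for an episode return: peaks at a fixed, made-up target vector.
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((params - target) ** 2)


best = cem_search(toy_fitness, dim=3)
print(best)
```

Because the search only needs a scalar score per candidate, it explores without gradients, which is the property the evolutionary variants add on top of DDPG and TD3.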
