Survival Dynamics of Neural and Programmatic Policies in Evolutionary Reinforcement Learning
Anton Roupassov-Ruiz, Yiyang Zuo

TL;DR
This study compares neural and programmatic policies in evolutionary reinforcement learning, showing that programmatic policies can outperform neural ones in survival duration within an ALife testbed.
Contribution
It introduces a fully specified open-source ALife testbed and provides a rigorous survival analysis comparing neural and programmatic policies.
Findings
Programmatic policies survive longer than neural policies.
SDDL agents with learning alone outperform neural agents with both learning and evaluation.
The difference in survival probability between PERL and NERL is statistically significant.
Abstract
In evolutionary reinforcement learning (ERL) tasks, agent policies are often encoded as small artificial neural networks (NERL). Such representations lack explicit modular structure, which limits behavioral interpretation. We investigate whether programmatic policies (PERL), implemented as soft, differentiable decision lists (SDDL), can match the performance of NERL. To support reproducible evaluation, we provide the first fully specified and open-source reimplementation of the classic 1992 Artificial Life (ALife) ERL testbed. We conduct a rigorous survival analysis across 4000 independent trials, using Kaplan-Meier curves and Restricted Mean Survival Time (RMST), metrics absent from the original study. We find a statistically significant difference in survival probability between PERL and NERL. PERL agents survive on average 201.69 steps longer than NERL agents. Moreover, SDDL agents using…
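For readers unfamiliar with the survival metrics named in the abstract, the following is a minimal sketch of a Kaplan-Meier estimator and the RMST computed as the area under the resulting step curve up to a horizon tau. It is not the authors' analysis code. The function names (`kaplan_meier`, `rmst`), the horizon value, and the toy lifetimes are all illustrative assumptions, and the numbers are not results from the paper.

```python
import numpy as np


def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function.

    durations: steps each agent (or population) survived in a trial.
    observed:  1/True if death was observed, 0/False if the trial was
               censored (for example, it hit a step cap while still alive).
    Returns (event_times, survival), where survival[i] is S(t) just
    after event_times[i].
    """
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    event_times = np.unique(durations[observed])
    survival = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(durations >= t)               # still alive just before t
        deaths = np.sum((durations == t) & observed)   # observed deaths at t
        s *= 1.0 - deaths / at_risk
        survival[i] = s
    return event_times, survival


def rmst(event_times, survival, tau):
    """Restricted mean survival time: area under the KM step curve on [0, tau]."""
    mask = event_times < tau
    knots = np.concatenate(([0.0], event_times[mask], [tau]))
    heights = np.concatenate(([1.0], survival[mask]))
    return float(np.sum(np.diff(knots) * heights))


# Toy lifetimes only (hypothetical, NOT the paper's data).
group_a = kaplan_meier([120, 300, 300, 450, 500], [1, 1, 0, 1, 0])
group_b = kaplan_meier([80, 150, 200, 260, 500], [1, 1, 1, 1, 0])
tau = 500.0  # assumed horizon for illustration
rmst_difference = rmst(*group_a, tau) - rmst(*group_b, tau)
```

A difference in RMST between two groups reads as "average extra steps survived up to tau", which is the kind of quantity the abstract reports when it says one group survives a given number of steps longer on average.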
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research
