Evolutionary Strategies lead to Catastrophic Forgetting in LLMs

Immanuel Abdi; Akshat Gupta; Micah Mok; Alexander Lu; Nicholas Lee; Gopala Anumanchipalli

arXiv:2601.20861 · cs.LG · January 29, 2026

TL;DR

This paper investigates how Evolutionary Strategies (ES), a gradient-free learning method, causes significant forgetting in large language models during continual training, highlighting a key challenge for online learning.

Contribution

The study provides a comprehensive analysis of ES's forgetting behavior in LLMs, revealing its limitations for continual learning and comparing it to gradient-based methods like GRPO.

Findings

01

ES achieves near-GRPO performance on math and reasoning tasks

02

ES exhibits significant forgetting of prior abilities during training

03

ES updates are less sparse and have larger $\ell_2$ norms than GRPO updates
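
To make the third finding concrete, the sketch below shows one way the two quantities it mentions could be measured for a single parameter update: the $\ell_2$ norm of the change and the fraction of entries left (almost) unchanged. The function name `update_stats` and the tolerance `tol` are illustrative assumptions, not taken from the paper, and the paper's exact sparsity definition may differ.

```python
import numpy as np

def update_stats(theta_before, theta_after, tol=1e-8):
    """Return (l2_norm, sparsity) of the update theta_after - theta_before.

    Sparsity here is the fraction of entries whose absolute change is at
    most `tol` (an assumed definition, chosen for illustration only).
    """
    delta = np.asarray(theta_after, dtype=np.float64) - np.asarray(theta_before, dtype=np.float64)
    l2_norm = float(np.linalg.norm(delta.ravel()))
    sparsity = float(np.mean(np.abs(delta) <= tol))
    return l2_norm, sparsity

# Toy example: two of four parameters change, by 3 and 4.
before = np.zeros(4)
after = np.array([0.0, 0.0, 3.0, 4.0])
l2_norm, sparsity = update_stats(before, after)
print(l2_norm, sparsity)
```

Under this definition, a dense update with a large norm moves many parameters far from their starting values, which is the pattern the finding attributes to ES relative to GRPO.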

Abstract

One of the biggest missing capabilities in current AI systems is the ability to learn continuously after deployment. Implementing such continually learning systems poses several challenges, one of which is the large memory requirement of the gradient-based algorithms used to train state-of-the-art LLMs. Evolutionary Strategies (ES) have recently re-emerged as a gradient-free alternative to traditional learning algorithms and have shown encouraging performance on specific tasks in LLMs. In this paper, we perform a comprehensive analysis of ES and specifically evaluate its forgetting curves when training for an increasing number of update steps. We first find that ES is able to reach performance numbers close to GRPO for math and reasoning tasks with a comparable compute budget. However, and most importantly for continual learning, the performance gains in ES are accompanied by…
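
For readers unfamiliar with gradient-free training, the following is a minimal sketch of one common ES variant on a toy problem. It perturbs the parameters with Gaussian noise, scores each perturbed copy with a reward function, and moves the parameters toward the better-scoring perturbations, with no backpropagation. It is not the paper's implementation: the function `es_step`, the hyperparameters (`pop_size`, `sigma`, `lr`), the reward normalization, and the toy reward are all assumptions chosen for illustration.

```python
import numpy as np

def es_step(theta, reward_fn, rng, pop_size=32, sigma=0.1, lr=0.02):
    """One ES update (illustrative variant, hypothetical hyperparameters)."""
    # Sample one Gaussian perturbation per population member.
    eps = rng.standard_normal((pop_size, theta.size))
    # Evaluate the reward at each perturbed parameter vector (forward passes only).
    rewards = np.array([reward_fn(theta + sigma * e) for e in eps])
    # Z-score the rewards so the step size does not depend on reward scale.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Reward-weighted average of perturbations estimates an ascent direction.
    direction = eps.T @ adv / (pop_size * sigma)
    return theta + lr * direction

# Toy reward: negative squared distance to a fixed target vector.
target = np.array([1.0, -2.0, 3.0])
reward = lambda x: -float(np.sum((x - target) ** 2))

rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(200):
    theta = es_step(theta, reward, rng)
print(np.linalg.norm(theta - target))
```

Because each step needs only reward evaluations, no gradients or optimizer states have to be stored, which is the memory advantage the abstract refers to.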

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Multimodal Machine Learning Applications