Evolving Deep Learning Optimizers
Mitchell Marfinetz

TL;DR
This paper introduces a genetic algorithm framework that automatically discovers deep learning optimizers. The evolved optimizer outperforms Adam on vision tasks, and the search surfaces design choices that differ from hand-crafted defaults.
Contribution
The paper presents an evolutionary search method for discovering deep learning optimizers. The search yields an optimizer that outperforms Adam, a hand-crafted baseline, and reveals new design principles.
Findings
Evolved optimizer outperforms Adam by 2.6% in aggregate fitness.
Achieves 7.7% relative improvement on CIFAR-10.
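Discovered optimizer combines sign-based updates with adaptive moment estimation, uses lower momentum coefficients than Adam (β₁ = 0.86, β₂ = 0.94), disables bias correction, and enables learning rate warmup with cosine decay.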
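To make the last finding concrete, here is a minimal sketch of what a single update step mixing a sign-based term with an Adam-style term could look like. It is not the paper's implementation: the function name `evolved_style_step`, the mixing weights `w_sign` and `w_adam`, and the default learning rate are assumptions chosen for illustration. Only the coefficients 0.86 and 0.94 and the absence of bias correction come from the abstract.

```python
import math

def evolved_style_step(params, grads, m, v, lr=1e-3, beta1=0.86, beta2=0.94,
                       eps=1e-8, w_sign=0.5, w_adam=0.5):
    """One hypothetical update step: sign term plus Adam-style term.

    m and v are lists of running first and second moments, updated in place.
    The mixing weights w_sign and w_adam are illustrative assumptions.
    """
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and squared gradient.
        m[i] = beta1 * m[i] + (1.0 - beta1) * g
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
        # Adam-style term, deliberately without bias correction.
        adam_term = m[i] / (math.sqrt(v[i]) + eps)
        # Sign of the raw gradient: -1, 0, or +1.
        sign_term = (g > 0) - (g < 0)
        new_params.append(p - lr * (w_sign * sign_term + w_adam * adam_term))
    return new_params
```

Without bias correction, the moment estimates start near zero, so the Adam-style term is damped in the first steps. This may be one reason learning rate warmup appears alongside it, though the abstract does not say so.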
Abstract
We present a genetic algorithm framework for automatically discovering deep learning optimization algorithms. Our approach encodes optimizers as genomes that specify combinations of primitive update terms (gradient, momentum, RMS normalization, Adam-style adaptive terms, and sign-based updates) along with hyperparameters and scheduling options. Through evolutionary search over 50 generations with a population of 50 individuals, evaluated across multiple vision tasks, we discover an evolved optimizer that outperforms Adam by 2.6% in aggregate fitness and achieves a 7.7% relative improvement on CIFAR-10. The evolved optimizer combines sign-based gradient terms with adaptive moment estimation, uses lower momentum coefficients than Adam (β₁ = 0.86, β₂ = 0.94), and notably disables bias correction while enabling learning rate warmup and cosine decay. Our results demonstrate that…
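The genome encoding and search loop described in the abstract can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the paper's code: the `Genome` fields, the parameter ranges, the mutation rate, uniform crossover, and truncation selection with elitism are all hypothetical choices. The `fitness` callable stands in for the paper's aggregate score over vision tasks. Only the five primitive terms, the 50 generations, and the population of 50 are taken from the abstract.

```python
import random
from dataclasses import dataclass

# Primitive update terms named in the abstract.
TERMS = ("gradient", "momentum", "rms", "adam", "sign")

@dataclass(frozen=True)
class Genome:
    weights: tuple          # one mixing weight per entry of TERMS (assumed encoding)
    lr: float
    beta1: float
    beta2: float
    bias_correction: bool
    warmup: bool
    cosine_decay: bool

def random_genome(rng):
    # Sampling ranges are illustrative assumptions.
    return Genome(
        weights=tuple(rng.uniform(0.0, 1.0) for _ in TERMS),
        lr=10 ** rng.uniform(-4, -2),
        beta1=rng.uniform(0.5, 0.99),
        beta2=rng.uniform(0.5, 0.999),
        bias_correction=rng.random() < 0.5,
        warmup=rng.random() < 0.5,
        cosine_decay=rng.random() < 0.5,
    )

def mutate(g, rng, rate=0.2):
    def jitter(x, lo, hi, scale=0.05):
        # Perturb with probability `rate`, then clamp to [lo, hi].
        if rng.random() < rate:
            x += rng.gauss(0.0, scale)
        return min(hi, max(lo, x))

    def flip(b):
        return (not b) if rng.random() < rate else b

    return Genome(
        weights=tuple(jitter(w, 0.0, 1.0) for w in g.weights),
        lr=jitter(g.lr, 1e-4, 1e-2, scale=1e-4),
        beta1=jitter(g.beta1, 0.5, 0.99),
        beta2=jitter(g.beta2, 0.5, 0.999),
        bias_correction=flip(g.bias_correction),
        warmup=flip(g.warmup),
        cosine_decay=flip(g.cosine_decay),
    )

def crossover(a, b, rng):
    # Uniform crossover: each gene comes from either parent with equal chance.
    def pick(x, y):
        return x if rng.random() < 0.5 else y

    return Genome(
        weights=tuple(pick(x, y) for x, y in zip(a.weights, b.weights)),
        lr=pick(a.lr, b.lr),
        beta1=pick(a.beta1, b.beta1),
        beta2=pick(a.beta2, b.beta2),
        bias_correction=pick(a.bias_correction, b.bias_correction),
        warmup=pick(a.warmup, b.warmup),
        cosine_decay=pick(a.cosine_decay, b.cosine_decay),
    )

def evolve(fitness, generations=50, pop_size=50, elite=5, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection (assumed)
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        pop = ranked[:elite] + children     # elitism keeps the best genomes
    return max(pop, key=fitness)
```

In a real run, `fitness` would train a model with the optimizer that a genome describes and return a validation score, which is by far the dominant cost. For a quick check of the loop itself, a toy function such as `lambda g: -abs(g.beta1 - 0.86)` is enough.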
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Metaheuristic Optimization Algorithms Research · Advanced Neural Network Applications
