Limited Evaluation Evolutionary Optimization of Large Neural Networks
Jonas Prellberg, Oliver Kramer

TL;DR
This paper implements an evolutionary algorithm that runs entirely on the GPU to train large neural networks. It finds that limited (batch) evaluation comes with a large accuracy trade-off, that simple random uniform crossover performs well, and it reports 97.6% accuracy on MNIST.
Contribution
The paper introduces an evolutionary algorithm that executes entirely on the GPU, which makes it practical to batch-evaluate a whole population of large neural networks, and uses this framework to study the limited evaluation approach and different crossover operators.
Findings
GPU implementation enables efficient batch evaluation of populations
Random uniform crossover performs well in evolutionary training
Achieved 97.6% accuracy on MNIST with an evolutionary algorithm
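To illustrate the random uniform crossover named in the findings, here is a minimal sketch. It assumes each network is flattened into a plain list of parameters and that each parameter is taken from either parent with probability 0.5. The function name `uniform_crossover` and its arguments are hypothetical and are not taken from the paper's code.

```python
import random

def uniform_crossover(parent_a, parent_b, rng=None, p=0.5):
    """Build a child by picking each parameter from parent_a with
    probability p and from parent_b otherwise.

    Assumption: both parents are flat lists of equal length.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of parameters")
    if rng is None:
        rng = random.Random()
    # Each position is decided independently of all the others.
    return [a if rng.random() < p else b for a, b in zip(parent_a, parent_b)]
```

Because every position is chosen independently, the operator ignores any structure in the network (layers, neurons), which is presumably why the abstract calls it unprincipled.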
Abstract
Stochastic gradient descent is the most prevalent algorithm to train neural networks. However, other approaches such as evolutionary algorithms are also applicable to this task. Evolutionary algorithms bring unique trade-offs that are worth exploring, but computational demands have so far restricted exploration to small networks with few parameters. We implement an evolutionary algorithm that executes entirely on the GPU, which allows us to efficiently batch-evaluate a whole population of networks. Within this framework, we explore the limited evaluation evolutionary algorithm for neural network training and find that its batch evaluation idea comes with a large accuracy trade-off. In further experiments, we explore crossover operators and find that unprincipled random uniform crossover performs extremely well. Finally, we train a network with 92k parameters on MNIST using an EA and…
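The batch evaluation idea mentioned in the abstract can be sketched as follows: instead of scoring each individual on the full training set, every individual in the population is scored on the same small, randomly drawn batch. This is a minimal illustration under stated assumptions, not the authors' implementation. The linear `predict` model, the function `batch_fitness`, and the plain Python loops are hypothetical stand-ins for the neural networks and GPU batching used in the paper.

```python
import random

def predict(weights, x):
    """Hypothetical stand-in model: thresholded dot product (not the paper's network)."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0 else 0

def batch_fitness(population, inputs, labels, batch_size, rng):
    """Score every individual on the same randomly drawn mini-batch.

    Returns one accuracy per individual, computed on batch_size examples
    instead of the whole dataset.
    """
    # Draw the batch once so that all individuals see identical examples.
    idx = rng.sample(range(len(inputs)), batch_size)
    scores = []
    for weights in population:
        correct = sum(predict(weights, inputs[i]) == labels[i] for i in idx)
        scores.append(correct / batch_size)
    return scores
```

A smaller `batch_size` makes each evaluation cheaper but the fitness estimate noisier, which is one plausible reading of the accuracy trade-off the abstract reports.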
Taxonomy
Topics: Neural Networks and Applications · Metaheuristic Optimization Algorithms Research · Advanced Neural Network Applications
