Correspondence between neuroevolution and gradient descent
Stephen Whitelam, Viktor Selin, Sang-Won Park, Isaac Tamblyn

TL;DR
This paper shows analytically, and confirms by numerical simulation, that neuroevolution with small mutations is equivalent to gradient descent on the loss function with Gaussian white noise, connecting two families of neural-network training methods usually considered fundamentally different.
Contribution
It establishes a theoretical and numerical connection between neuroevolution and gradient descent, showing their equivalence in the small mutation limit.
Findings
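In the limit of small mutations, neuroevolution is equivalent to gradient descent on the loss function with Gaussian white noise.
Averaged over independent realizations of the learning process, neuroevolution is equivalent to noise-free gradient descent.
Numerical simulations show the correspondence for finite mutations, in both shallow and deep networks.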
Abstract
We show analytically that training a neural network by conditioned stochastic mutation or neuroevolution of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations, for shallow and deep neural networks. Our results provide a connection between two families of neural-network training methods that are usually considered to be fundamentally different.
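The averaged correspondence described in the abstract can be illustrated with a small numerical sketch. It is not the authors' code. It assumes a toy quadratic loss with hypothetical curvatures `a` and weights `w`, and a greedy acceptance rule that keeps a Gaussian mutation only if the loss does not increase; the acceptance rule used in the paper may differ. The sketch averages many independent single-mutation steps from the same starting point and measures how well the mean step aligns with the negative gradient.

```python
import numpy as np

# Toy quadratic loss U(w) = 0.5 * sum(a_i * w_i^2); an assumption for illustration only.
def loss(w, a):
    return 0.5 * np.sum(a * w**2)

def grad(w, a):
    return a * w

def mean_mutation_step(w, a, sigma, n_trials, rng):
    """Average displacement over independent one-step neuroevolution trials.

    Each trial proposes a Gaussian mutation of all weights and keeps it only
    if the loss does not increase (greedy rule, assumed here). Rejected
    mutations count as a zero step.
    """
    eps = rng.normal(0.0, sigma, size=(n_trials, w.size))
    base = loss(w, a)
    new_losses = 0.5 * np.sum(a * (w + eps) ** 2, axis=1)
    accepted = new_losses <= base
    steps = eps * accepted[:, None]
    return steps.mean(axis=0)

rng = np.random.default_rng(0)
a = np.array([1.0, 2.0, 0.5, 3.0, 1.5])    # hypothetical curvatures
w = np.array([1.0, -0.5, 2.0, 0.3, -1.2])  # hypothetical starting weights

drift = mean_mutation_step(w, a, sigma=1e-3, n_trials=20000, rng=rng)
g = grad(w, a)

# Cosine similarity between the mean mutation step and the descent direction -g.
cosine = -(drift @ g) / (np.linalg.norm(drift) * np.linalg.norm(g))
print(f"cosine(mean step, -gradient) = {cosine:.4f}")
```

A cosine close to 1 means the averaged mutation step points along the direction of steepest descent, while any single trial still looks like a noisy step. Increasing `sigma` in this sketch is one way to probe how the alignment behaves for finite mutations.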
