Why to "grow" and "harvest" deep learning models?
Ilona Kulikovskikh, Tarzan Legović

TL;DR
This paper recasts the training of deep learning models in terms of population dynamics and reports that SGD with fixed growth and harvesting rates outperforms common adaptive gradient methods in transparency, convergence rate, and inductive bias.
Contribution
It introduces a growth and harvesting framework for neural network training and shows that this approach surpasses the most common adaptive gradient methods on all three stated training requirements: transparency, convergence rate, and inductive bias.
Findings
SGD with growth and harvesting rates outperforms adaptive methods
The approach improves transparency and convergence rates
It enhances inductive biases in deep learning models
Abstract
Current expectations from training deep learning models with gradient-based methods include: 1) transparency; 2) high convergence rates; 3) high inductive biases. While the state-of-the-art methods with adaptive learning rate schedules are fast, they still fail to meet the other two requirements. We suggest reconsidering neural network models in terms of single-species population dynamics, where adaptation comes naturally from open-ended processes of "growth" and "harvesting". We show that stochastic gradient descent (SGD) with two balanced pre-defined values of per capita growth and harvesting rates outperforms the most common adaptive gradient methods in all three requirements.
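For readers unfamiliar with the population-dynamics vocabulary in the abstract, the sketch below may help. It simulates the textbook single-species model of logistic growth with proportional harvesting, dN/dt = r*N*(1 - N/K) - h*N, where r is the per capita growth rate and h the per capita harvesting rate. This is background only: the abstract does not give the paper's actual update rule, so nothing here should be read as the authors' optimizer. The function names, the carrying capacity K, and all numeric values are illustrative assumptions.

```python
def simulate_harvested_growth(r, h, K=1.0, n0=0.01, dt=0.01, steps=20000):
    """Forward-Euler integration of dN/dt = r*N*(1 - N/K) - h*N.

    r: per capita growth rate, h: per capita harvesting rate.
    K, n0, dt and steps are illustrative assumptions, not values from the paper.
    """
    n = n0
    for _ in range(steps):
        n += dt * (r * n * (1.0 - n / K) - h * n)
    return n


def equilibrium(r, h, K=1.0):
    """Nonzero steady state K*(1 - h/r); the population dies out when h >= r."""
    return K * (1.0 - h / r) if h < r else 0.0


final = simulate_harvested_growth(r=1.0, h=0.25)
print(final, equilibrium(r=1.0, h=0.25))
```

In this textbook model the two rates jointly determine where the population settles: with h below r it approaches K*(1 - h/r), and with h at or above r it decays to zero. That interplay is one way to read the "balanced" growth and harvesting rates the abstract mentions, though the paper's own formulation may differ.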
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods
