Why to "grow" and "harvest" deep learning models?

Ilona Kulikovskikh; Tarzan Legović

arXiv:2008.03501·cs.LG·August 11, 2020

TL;DR

This paper recasts the training of deep learning models in terms of single-species population dynamics and reports that SGD with fixed per capita growth and harvesting rates outperforms common adaptive gradient methods in transparency, convergence rate, and inductive bias.

Contribution

It introduces a growth and harvesting framework for neural network training and shows that this approach surpasses common adaptive gradient methods on the three stated training requirements: transparency, convergence rate, and inductive bias.

Findings

01

SGD with growth and harvesting rates outperforms adaptive methods

02

The approach improves transparency and convergence rates

03

It enhances inductive biases in deep learning models

Abstract

Current expectations from training deep learning models with gradient-based methods include: 1) transparency; 2) high convergence rates; 3) high inductive biases. While the state-of-the-art methods with adaptive learning rate schedules are fast, they still fail to meet the other two requirements. We suggest reconsidering neural network models in terms of single-species population dynamics where adaptation comes naturally from open-ended processes of "growth" and "harvesting". We show that stochastic gradient descent (SGD) with two balanced pre-defined values of per capita growth and harvesting rates outperforms the most common adaptive gradient methods in all three requirements.
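For readers unfamiliar with the population-dynamics vocabulary in the abstract, the sketch below simulates the textbook single-species model with logistic growth and proportional harvesting, dN/dt = r·N·(1 − N/K) − h·N, where r is the per capita growth rate and h the per capita harvesting rate. This is an illustration of the general concept only. It is an assumption that this classical form is close to what the authors use, and the abstract does not say how the rates enter the SGD update. The names `simulate_population` and `equilibrium`, the carrying capacity `K`, and all numeric values are hypothetical.

```python
def simulate_population(r, h, K=1.0, n0=0.01, dt=0.01, steps=5000):
    """Forward-Euler integration of dN/dt = r*N*(1 - N/K) - h*N.

    r: per capita growth rate, h: per capita harvesting rate,
    K: carrying capacity (illustrative assumption, not from the paper).
    Returns the list of population sizes at each time step.
    """
    n = n0
    trajectory = [n]
    for _ in range(steps):
        growth = r * n * (1.0 - n / K)   # logistic growth term
        harvest = h * n                  # proportional harvesting term
        n += dt * (growth - harvest)
        trajectory.append(n)
    return trajectory


def equilibrium(r, h, K=1.0):
    """Non-negative steady state of the model: K*(1 - h/r), or 0 if h >= r."""
    return max(0.0, K * (1.0 - h / r))


if __name__ == "__main__":
    # Growth exceeds harvesting: the population settles at a positive level.
    balanced = simulate_population(r=1.0, h=0.25)
    print(balanced[-1], equilibrium(1.0, 0.25))

    # Harvesting exceeds growth: the population collapses toward zero.
    overharvested = simulate_population(r=1.0, h=1.5)
    print(overharvested[-1], equilibrium(1.0, 1.5))
```

In this toy model the balance between the two rates fixes the long-run state: when h < r the population settles at K·(1 − h/r), and when h ≥ r it dies out. That is one way to read the abstract's phrase "two balanced pre-defined values", although the mapping to network training is the paper's contribution and is not reproduced here.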

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods