The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks

Samuel Lippl; L. F. Abbott; SueYeon Chung

arXiv:2202.02649 · stat.ML · February 8, 2022

TL;DR

This paper derives the infinite-time limit of gradient-descent training for gated linear networks, making explicit the implicit bias that shapes what these networks learn and how well they perform.

Contribution

The paper derives the infinite-time training limit for gated linear networks (GLNs), generalizes the result to gated networks described by general homogeneous polynomials, and applies the resulting predictions to GLNs trained on MNIST.

Findings

01. Theoretical predictions match empirical results for GLNs trained on MNIST.

02. Architectural constraints and the implicit bias of gradient descent both affect network performance.

03. The framework captures a substantial portion of the inductive bias of ReLU networks.

Abstract

Understanding the asymptotic behavior of gradient-descent training of deep neural networks is essential for revealing inductive biases and improving network performance. We derive the infinite-time training limit of a mathematically tractable class of deep nonlinear neural networks, gated linear networks (GLNs), and generalize these results to gated networks described by general homogeneous polynomials. We study the implications of our results, focusing first on two-layer GLNs. We then apply our theoretical predictions to GLNs trained on MNIST and show how architectural constraints and the implicit bias of gradient descent affect performance. Finally, we show that our theory captures a substantial portion of the inductive bias of ReLU networks. By making the inductive bias explicit, our framework is poised to inform the development of more efficient, biologically plausible, and robust…
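
To make the term "gated linear network" concrete, the following is a minimal sketch of a two-layer gated network, not the paper's exact definition. It assumes gates of the form 1[a · x > 0] with fixed vectors a that are never trained; the names `gln_forward`, `relu_forward`, `gate_vecs`, `weights`, and `readout` are hypothetical. Under these assumptions the output is homogeneous of degree 2 in the trained parameters, and tying each gate vector to its weight vector reproduces a ReLU network.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def gln_forward(x, gate_vecs, weights, readout):
    """Two-layer gated linear network on a single input x (illustrative form)."""
    total = 0.0
    for a, w, v in zip(gate_vecs, weights, readout):
        gate = 1.0 if dot(a, x) > 0 else 0.0  # gate depends on x only, never on w or v
        total += v * gate * dot(w, x)         # each unit is linear in x once its gate is fixed
    return total


def relu_forward(x, weights, readout):
    """Standard two-layer ReLU network, for comparison."""
    return sum(v * max(dot(w, x), 0.0) for w, v in zip(weights, readout))


rng = random.Random(0)
d, k = 3, 4  # input dimension and number of hidden units (arbitrary choices)
x = [rng.gauss(0, 1) for _ in range(d)]
A = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]  # fixed gating vectors
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]  # trained first-layer weights
v = [rng.gauss(0, 1) for _ in range(k)]                      # trained readout weights

out = gln_forward(x, A, W, v)

# Doubling both trained layers multiplies the output by 2 ** 2.
W2 = [[2.0 * wi for wi in w] for w in W]
v2 = [2.0 * vi for vi in v]
scaled = gln_forward(x, A, W2, v2)

# Using the weights themselves as gate vectors gives 1[w.x > 0] * (w.x) = relu(w.x).
tied = gln_forward(x, W, W, v)
```

Because the gates here do not depend on the trained weights, the network is a polynomial in those weights for any fixed input, which is what makes this kind of model more tractable to analyze than a ReLU network.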

Code & Models

Repositories

sflippl/implicit-bias-glns (PyTorch, official)

Taxonomy

Topics: Machine Learning and ELM · Stochastic Gradient Optimization Techniques · Neural Networks and Applications