Implicit Bias of Linear Equivariant Networks
Hannah Lawrence, Kristian Georgiev, Andrew Dienes, Bobak T. Kiani

TL;DR
This paper analyzes the implicit bias of linear group equivariant CNNs trained with gradient descent, revealing low-rank Fourier solutions and extending previous CNN bias results to more general group structures.
Contribution
It generalizes prior implicit-bias results for linear CNNs to linear G-CNNs over all finite groups, including non-commutative ones, and to band-limited G-CNNs over infinite groups.
Findings
Linear G-CNNs converge to low-rank Fourier solutions.
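The implicit regularizer is the $2/L$-Schatten matrix norm of the Fourier coefficients, so its strength as a low-rank bias depends on the network depth $L$.
Experiments on a variety of groups validate the theory, with additional empirical exploration of nonlinear networks.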
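The depth dependence of the regularizer can be illustrated numerically. The following is a minimal sketch, not the authors' code: the helper names `schatten_norm` and `cyclic_fourier_penalty` are hypothetical, and the second function assumes the simplest case of a cyclic group, where every Fourier coefficient is a 1x1 matrix and the group Fourier transform is the ordinary unitary DFT.

```python
import numpy as np

def schatten_norm(mat, p):
    """Schatten p-(quasi-)norm: the l_p norm of the singular values of `mat`."""
    singular_values = np.linalg.svd(np.atleast_2d(mat), compute_uv=False)
    return float(np.sum(singular_values ** p) ** (1.0 / p))

def cyclic_fourier_penalty(beta, depth):
    """(2/depth)-Schatten (quasi-)norm of the unitary DFT of `beta`.

    Assumption: cyclic group only, so each Fourier block is a scalar and
    its single singular value is the coefficient's absolute value.
    """
    p = 2.0 / depth
    coeffs = np.abs(np.fft.fft(np.asarray(beta, dtype=float), norm="ortho"))
    return float(np.sum(coeffs ** p) ** (1.0 / p))

# Two predictors with the same Euclidean norm (equal to 2).
dense_in_fourier = [2.0, 0.0, 0.0, 0.0]   # DFT spreads over all frequencies
sparse_in_fourier = [1.0, 1.0, 1.0, 1.0]  # DFT concentrates on one frequency

for depth in (1, 2, 3):
    print(depth,
          cyclic_fourier_penalty(dense_in_fourier, depth),
          cyclic_fourier_penalty(sparse_in_fourier, depth))
```

With `depth=1` the penalty is the Euclidean norm and cannot tell the two predictors apart; for `depth` of 2 or more the Fourier-sparse predictor is penalized less, and the gap widens with depth. For a non-commutative group the Fourier coefficients are genuine matrices, and `schatten_norm` would be applied to each block instead of taking absolute values.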
Abstract
Group equivariant convolutional neural networks (G-CNNs) are generalizations of convolutional neural networks (CNNs) which excel in a wide range of technical applications by explicitly encoding symmetries, such as rotations and permutations, in their architectures. Although the success of G-CNNs is driven by their \emph{explicit} symmetry bias, a recent line of work has proposed that the \emph{implicit} bias of training algorithms on particular architectures is key to understanding generalization for overparameterized neural nets. In this context, we show that $L$-layer full-width linear G-CNNs trained via gradient descent for binary classification converge to solutions with low-rank Fourier matrix coefficients, regularized by the $2/L$-Schatten matrix norm. Our work strictly generalizes previous analysis on the implicit bias of linear CNNs to linear G-CNNs over all finite groups,…
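For reference, the standard definition of the Schatten norm named in the abstract is given below. The symbols $A$, $\sigma_i$, and $p$ are notation chosen here for illustration and may differ from the paper's own.

\[
\|A\|_{S_p} \;=\; \Bigl( \sum_{i} \sigma_i(A)^{p} \Bigr)^{1/p},
\qquad p = \frac{2}{L},
\]

where $\sigma_i(A)$ are the singular values of $A$. Setting $L = 1$ gives $p = 2$, the Frobenius norm; $L = 2$ gives $p = 1$, the nuclear norm; and $L > 2$ gives $p < 1$, a quasi-norm that favors low-rank matrices more strongly as depth grows.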
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Model Reduction and Neural Networks
