Geometric compression of invariant manifolds in neural nets

Jonas Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, and Matthieu Wyart

arXiv:2007.11471·cs.LG·May 7, 2021

TL;DR

This paper investigates how neural networks compress uninformative input directions during training, leading to improved learning efficiency and better alignment of the neural tangent kernel with label-relevant features.

Contribution

It introduces a geometric perspective on neural network compression of invariant manifolds and quantifies its impact on learning curves and kernel evolution.

Findings

01. Compression occurs in the feature learning regime, improving test error.

02. Lazy training shows no compression and slower learning curves.

03. Kernel eigenvectors become more label-aligned due to compression.

Abstract

We study how neural networks compress uninformative input space in models where data lie in $d$ dimensions, but whose labels only vary within a linear manifold of dimension $d_{\parallel} < d$. We show that for a one-hidden-layer network initialized with infinitesimal weights (i.e. in the feature learning regime) and trained with gradient descent, the first layer of weights evolves to become nearly insensitive to the $d_{\perp} = d - d_{\parallel}$ uninformative directions. These are effectively compressed by a factor $\lambda \sim \sqrt{p}$, where $p$ is the size of the training set. We quantify the benefit of such a compression on the test error $\epsilon$. For large initialization of the weights (the lazy training regime), no compression occurs, and for regular boundaries separating labels we find that $\epsilon \sim p^{-\beta}$, with $\beta_{\mathrm{Lazy}} = d / (3 d - 2)$. Compression improves the learning…
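The data model in the abstract can be illustrated with a minimal sketch. It assumes Gaussian inputs and a hypothetical label rule (the sign of the sum of the informative coordinates); neither choice is taken from the paper, and the function names `make_data`, `label`, and `beta_lazy` are illustrative only. The sketch checks that labels are unchanged when the $d_{\perp}$ uninformative coordinates are shifted, and evaluates the lazy-regime exponent $\beta_{\mathrm{Lazy}} = d / (3d - 2)$ quoted above.

```python
import numpy as np


def label(x, d_par):
    # Hypothetical label rule (an assumption, not the paper's):
    # the label depends only on the first d_par (informative) coordinates.
    return np.where(x[:, :d_par].sum(axis=1) > 0, 1.0, -1.0)


def make_data(p, d, d_par, seed=0):
    # Sample p points in d dimensions (Gaussian inputs are an assumption).
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((p, d))
    return x, label(x, d_par)


def beta_lazy(d):
    # Lazy-regime learning-curve exponent quoted in the abstract.
    return d / (3 * d - 2)


x, y = make_data(p=200, d=5, d_par=2)

# Shifting only the d_perp = d - d_par uninformative coordinates
# must leave every label unchanged.
x_shifted = x.copy()
x_shifted[:, 2:] += 10.0
labels_unchanged = bool(np.array_equal(label(x_shifted, 2), y))

print(labels_unchanged)
print([round(beta_lazy(d), 3) for d in (1, 2, 10, 100)])
```

For large $d$ the exponent approaches $1/3$, so in this regime the lazy learning curve decays only slowly with the training set size $p$.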

Code & Models

Repositories

mariogeiger/feature_lazy (PyTorch, official)

Taxonomy

Methods: Neural Tangent Kernel