Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff

Arthur Jacot

arXiv:2305.19008 · cs.LG · August 16, 2024 · 2 cites

TL;DR

This paper investigates the tradeoff between low-dimensional feature representations and regularity in deep neural networks, providing theoretical insights into how depth influences learned feature structure and complexity.

Contribution

It introduces finite depth corrections and formalizes the balance between low-dimensionality and regularity, proving the bottleneck structure in features as depth increases.

Findings

01

Almost all hidden representations become low-dimensional at large depth

02

Almost all weight matrices have $R^{(0)}(f)$ singular values close to 1, while the remaining singular values shrink as depth grows

03

Large learning rates are necessary for convergence of deep representations
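The second finding is a statement about the singular value spectrum of each weight matrix. The sketch below is only an illustration of how such a spectrum could be inspected, not the paper's procedure: the helper `count_near_one`, the tolerance `band`, and the synthetic matrix `W` (built with a prescribed spectrum) are all assumptions made for the example.

```python
import numpy as np

def count_near_one(W, band=0.1):
    """Return (number of singular values within `band` of 1, all singular values).

    `band` is an arbitrary illustrative tolerance, not a value from the paper.
    """
    s = np.linalg.svd(W, compute_uv=False)  # sorted in descending order
    near_one = int(np.sum(np.abs(s - 1.0) <= band))
    return near_one, s

# Build a synthetic 6x6 matrix with a known spectrum: two singular values
# near 1 and four small ones, mimicking the structure described above.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(6, 6)))  # random orthogonal factor
V, _ = np.linalg.qr(rng.normal(size=(6, 6)))  # random orthogonal factor
target = np.array([1.02, 0.97, 0.08, 0.05, 0.03, 0.01])
W = U @ np.diag(target) @ V.T

k, s = count_near_one(W)
print("singular values:", np.round(s, 3))
print("count near 1:", k)
```

On a trained network one would apply the same check to each layer's weight matrix; the count of singular values near 1 then serves as a rough estimate of the inner dimension that layer preserves.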

Abstract

Previous work has shown that DNNs with large depth $L$ and $L_{2}$-regularization are biased towards learning low-dimensional representations of the inputs, which can be interpreted as minimizing a notion of rank $R^{(0)}(f)$ of the learned function $f$, conjectured to be the Bottleneck rank. We compute finite depth corrections to this result, revealing a measure $R^{(1)}$ of regularity which bounds the pseudo-determinant of the Jacobian $|Jf(x)|_{+}$ and is subadditive under composition and addition. This formalizes a balance between learning low-dimensional representations and minimizing complexity/irregularity in the feature maps, allowing the network to learn the 'right' inner dimension. Finally, we prove the conjectured bottleneck structure in the learned features as $L \to \infty$: for large depths, almost all hidden representations are approximately…
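The pseudo-determinant $|Jf(x)|_{+}$ mentioned in the abstract is the product of the nonzero singular values of the Jacobian. As a minimal sketch, assuming a toy map $f(x) = B\tanh(Ax)$ that passes through a 2-dimensional hidden layer, it can be computed as follows. The function names, the matrices `A` and `B`, and the relative cutoff `tol` are hypothetical choices for illustration and do not come from the paper.

```python
import numpy as np

def pseudo_determinant(J, tol=1e-10):
    """Return (product of nonzero singular values of J, number of them).

    Singular values below tol * (largest singular value) are treated as zero.
    For an all-zero matrix the empty product gives 1.0 with rank 0.
    """
    s = np.linalg.svd(J, compute_uv=False)
    nonzero = s[s > tol * s.max()]
    return float(np.prod(nonzero)), int(nonzero.size)

def toy_jacobian(x, A, B):
    """Jacobian of f(x) = B @ tanh(A @ x), i.e. B @ diag(1 - tanh(Ax)^2) @ A."""
    h = np.tanh(A @ x)
    return B @ np.diag(1.0 - h**2) @ A

# Toy map R^5 -> R^2 -> R^5: the Jacobian is 5x5 but has rank at most 2.
rng = np.random.default_rng(0)
A = 0.3 * rng.normal(size=(2, 5))
B = rng.normal(size=(5, 2))
x = rng.normal(size=5)

J = toy_jacobian(x, A, B)
pdet, rank = pseudo_determinant(J)
print("numerical rank of Jf(x):", rank)
print("pseudo-determinant |Jf(x)|_+:", pdet)
```

Because the ordinary determinant of this rank-deficient Jacobian is zero, the pseudo-determinant is the quantity that still carries information about how much the map stretches or contracts along the directions it actually uses.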

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Sparse and Compressive Sensing Techniques · Model Reduction and Neural Networks

Methods: Neural Tangent Kernel