Revealing the Utilized Rank of Subspaces of Learning in Neural Networks
Isha Garg, Christian Koguchi, Eshan Verma, Daniel Ulbricht

TL;DR
This paper investigates how neural network weights utilize the available subspace, revealing that most models underutilize their capacity, and introduces a transformation to uncover low-rank structures that can reduce parameters with minimal accuracy loss.
Contribution
The authors propose a data-driven transformation to reveal the low-rank structure of neural network weights, demonstrating significant parameter reduction with minimal performance impact.
Findings
Most learned weights appear to be full rank, so they are not directly amenable to low-rank decomposition.
Pre-trained models utilize a larger fraction of the available space.
Parameters can be reduced to 50-75% of the original count with less than a 0.2% accuracy drop.
Abstract
In this work, we study how well the learned weights of a neural network utilize the space available to them. This notion is related to capacity, but additionally incorporates the interaction of the network architecture with the dataset. Most learned weights appear to be full rank, and are therefore not amenable to low rank decomposition. This deceptively implies that the weights are utilizing the entire space available to them. We propose a simple data-driven transformation that projects the weights onto the subspace where the data and the weight interact. This preserves the functional mapping of the layer and reveals its low rank structure. In our findings, we conclude that most models utilize a fraction of the available space. For instance, for ViTB-16 and ViTL-16 trained on ImageNet, the mean layer utilization is 35% and 20% respectively. Our transformation results in reducing the…
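The idea of projecting a weight matrix onto the subspace where it interacts with the data can be illustrated with a small NumPy sketch. This is not the authors' exact procedure, which the truncated abstract does not spell out. It assumes the relevant subspace is spanned by the leading eigenvectors of the layer's input second-moment matrix. The function name `project_weight_onto_input_subspace`, the `energy` threshold, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def project_weight_onto_input_subspace(W, X, energy=0.99):
    """Project W (d_out, d_in) onto the subspace spanned by the inputs X (n, d_in).

    Assumption: the "interaction" subspace is approximated by the top
    eigenvectors of the uncentered input second-moment matrix that together
    capture the requested fraction of its total energy.
    """
    second_moment = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(second_moment)  # ascending order
    order = np.argsort(eigvals)[::-1]                 # sort descending
    eigvals = np.clip(eigvals[order], 0.0, None)      # guard tiny negatives
    eigvecs = eigvecs[:, order]

    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cumulative, energy)) + 1, eigvals.size)

    P = eigvecs[:, :k]            # (d_in, k) basis of the input subspace
    W_proj = W @ P @ P.T          # rank of W_proj is at most k
    return W_proj, k


# Synthetic example: inputs live in an 8-dimensional subspace of a 64-dim space.
rng = np.random.default_rng(0)
d_in, d_out, true_dim, n = 64, 32, 8, 1000
basis = rng.standard_normal((true_dim, d_in))
X = rng.standard_normal((n, true_dim)) @ basis
W = rng.standard_normal((d_out, d_in))   # a random dense matrix is full rank

W_proj, k = project_weight_onto_input_subspace(W, X, energy=0.999)
full_rank = np.linalg.matrix_rank(W)
proj_rank = np.linalg.matrix_rank(W_proj)
max_err = np.abs(X @ W.T - X @ W_proj.T).max()

print("rank of W:", full_rank)
print("rank of projected W:", proj_rank, "with k =", k)
print("max output difference on the data:", max_err)
```

In this toy setting the original matrix looks full rank, while the projected matrix has low rank and produces essentially the same outputs on the data, which mirrors the abstract's claim that the functional mapping is preserved while the low-rank structure is revealed.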
Taxonomy
Topics: Neural Networks and Applications
