Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features
Adityanarayanan Radhakrishnan, Daniel Beaglehole, Parthe Pandit, Mikhail Belkin

TL;DR
This paper identifies the mechanism behind feature learning in deep fully connected neural networks: the average gradient outer product. The resulting framework explains how features are selected and yields a backpropagation-free feature learning method that can be applied to other models, such as kernel machines.
Contribution
The paper introduces the Deep Neural Feature Ansatz, which explains neural feature learning through the average gradient outer product, and develops Recursive Feature Machines, kernel machines that recursively learn features and outperform existing models on tabular data.
Findings
Deep neural networks learn features via the average gradient outer product.
The proposed mechanism explains phenomena like spurious features and the lottery ticket hypothesis.
Recursive Feature Machines outperform existing models on tabular data.
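The average gradient outer product named in the findings can be illustrated on a toy predictor. The sketch below is not the authors' code: the function names, the toy predictor, and the finite-difference gradient are all assumptions made for illustration. It averages the outer products of a predictor's input gradients over a sample, so coordinates the predictor ignores receive zero weight on the diagonal.

```python
import numpy as np


def numerical_gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of a scalar function f at x."""
    grad = np.zeros_like(x, dtype=float)
    for j in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[j] = eps
        grad[j] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad


def average_gradient_outer_product(f, X):
    """Return (1/n) * sum_i grad f(x_i) grad f(x_i)^T over the rows x_i of X."""
    grads = np.stack([numerical_gradient(f, x) for x in X])  # shape (n, d)
    return grads.T @ grads / X.shape[0]                      # shape (d, d)


def toy_predictor(x):
    # Hypothetical predictor: depends only on coordinates 0 and 1.
    return np.sin(x[0]) + 0.5 * x[1] ** 2


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))  # 200 samples, 4 input features (synthetic data)

agop = average_gradient_outer_product(toy_predictor, X)
feature_importance = np.diag(agop)  # per-coordinate weight assigned by the AGOP
print(feature_importance)
```

Here the diagonal entries for coordinates 2 and 3 are zero because the toy predictor never uses them, which is the sense in which this matrix up-weights features related to the model output.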
Abstract
In recent years neural networks have achieved impressive results on many technological and scientific tasks. Yet, the mechanism through which these models automatically select features, or patterns in data, for prediction remains unclear. Identifying such a mechanism is key to advancing performance and interpretability of neural networks and promoting reliable adoption of these models in scientific applications. In this paper, we identify and characterize the mechanism through which deep fully connected neural networks learn features. We posit the Deep Neural Feature Ansatz, which states that neural feature learning occurs by implementing the average gradient outer product to up-weight features strongly related to model output. Our ansatz sheds light on various deep learning phenomena including emergence of spurious features and simplicity biases and how pruning networks can increase…
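One common way to write the ansatz stated in the abstract is sketched below. The notation is an assumption introduced here, not taken from the text above: $W_i$ denotes the weight matrix of layer $i$, $h_i(x)$ the input to that layer, $f_i$ the part of the network mapping that layer input to the output, and $x_1, \dots, x_n$ the training inputs.

```latex
W_i^{\top} W_i \;\propto\; \frac{1}{n} \sum_{p=1}^{n}
  \nabla f_i\big(h_i(x_p)\big)\, \nabla f_i\big(h_i(x_p)\big)^{\top}
```

Read this way, the right-hand side is the average gradient outer product taken with respect to the input of layer $i$, and the left-hand side is the matrix that determines how that layer weights directions of its input.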
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
Methods: Pruning
