A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features
Zhenmei Shi, Junyi Wei, Yingyu Liang

TL;DR
This paper proves that neural networks trained by gradient descent learn effective features from structured input data, which explains their advantage over fixed features and linear models. Preliminary experiments support the analysis.
Contribution
The paper proves that neural networks trained by gradient descent can learn effective features from structured data, whereas linear models on fixed (data-independent) features of polynomial size cannot reach comparable accuracy.
Findings
Neural networks succeed on structured data by learning effective features.
Linear models on data-independent features cannot achieve similar accuracy.
Removing input structure prevents polynomial algorithms from learning even weakly.
Abstract
An important characteristic of neural networks is their ability to learn representations of the input data with effective features for prediction, which is believed to be a key factor to their superior empirical performance. To better understand the source and benefit of feature learning in neural networks, we consider learning problems motivated by practical data, where the labels are determined by a set of class relevant patterns and the inputs are generated from these along with some background patterns. We prove that neural networks trained by gradient descent can succeed on these problems. The success relies on the emergence and improvement of effective features, which are learned among exponentially many candidates efficiently by exploiting the data (in particular, the structure of the input distribution). In contrast, no linear models on data-independent features of polynomial…
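The data model described in the abstract (labels determined by class-relevant patterns, inputs built from those plus background patterns) can be illustrated with a small synthetic generator. This is only a minimal sketch under assumptions that are not taken from the paper: the function name `make_toy_data`, the orthonormal pattern dictionary, the independent on/off probability `p_on`, and the "at least one relevant pattern present" label rule are all hypothetical choices made for illustration.

```python
import numpy as np

def make_toy_data(n, d=50, num_patterns=20, num_relevant=3, p_on=0.3, seed=0):
    """Generate a toy dataset where inputs are sums of patterns (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Assumption: patterns are orthonormal columns of a random dictionary.
    dictionary, _ = np.linalg.qr(rng.standard_normal((d, num_patterns)))

    # Assumption: the first `num_relevant` patterns are class-relevant,
    # the remaining ones play the role of background patterns.
    relevant = np.arange(num_relevant)

    # Hidden binary indicators: which patterns appear in each input.
    phi = (rng.random((n, num_patterns)) < p_on).astype(float)

    # Each input is the sum of the patterns that are switched on.
    x = phi @ dictionary.T

    # Assumption: label is +1 if any class-relevant pattern is present, else -1.
    y = np.where(phi[:, relevant].sum(axis=1) >= 1, 1, -1)
    return x, y, phi, dictionary, relevant

x, y, phi, dictionary, relevant = make_toy_data(200)
print(x.shape, y[:10])
```

In this sketch the label depends only on the hidden indicators of the relevant patterns, so a learner that recovers those pattern directions from the inputs has useful features, while the background patterns act as label-irrelevant variation.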
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Machine Learning and Algorithms
