Diverse Neural Network Learns True Target Functions
Bo Xie, Yingyu Liang, Le Song

TL;DR
This paper shows that one-hidden-layer ReLU neural networks with diverse units have no spurious local minima, so networks at stationary points of the training loss can learn the true target function. The analysis gives theoretical insight into both their optimization and their generalization properties.
Contribution
It introduces a novel analysis showing that diversity among units ensures no spurious local minima in non-convex neural network training.
Findings
Neural networks with diverse units have no spurious local minima.
The loss can be made arbitrarily small if the minimum singular value of the "extended feature matrix" is large enough.
A new regularization promotes unit diversity and potentially improves generalization.
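The second finding can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' code. It assumes a network f(x) = sum_k v_k ReLU(w_k . x) with fixed output weights v_k in {-1, +1}, a squared loss averaged over n samples, and an extended feature matrix D whose l-th column stacks v_k 1[w_k . x_l > 0] x_l over the hidden units k. Under those assumptions the gradient with respect to the hidden weights is D r / n for the residual vector r, so ||r|| <= n ||grad|| / s_min(D) whenever D has at least as many rows as columns. The function names and toy sizes are hypothetical.

```python
import numpy as np

def extended_feature_matrix(W, v, X):
    """Assumed definition: column l stacks v_k * 1[w_k.x_l > 0] * x_l over units k.

    W: (m, d) hidden weights, v: (m,) output weights, X: (n, d) inputs.
    Returns D with shape (m * d, n).
    """
    act = (X @ W.T > 0).astype(X.dtype)             # (n, m) ReLU derivative
    blocks = (act * v)[:, :, None] * X[:, None, :]  # (n, m, d)
    return blocks.reshape(X.shape[0], -1).T         # (m * d, n)

def loss_and_grad(W, v, X, y):
    """Squared loss 1/(2n) * sum r_l^2 and its gradient w.r.t. W."""
    n = X.shape[0]
    pred = np.maximum(X @ W.T, 0.0) @ v             # network outputs, (n,)
    r = pred - y                                    # residuals, (n,)
    loss = 0.5 * np.mean(r ** 2)
    D = extended_feature_matrix(W, v, X)
    grad = (D @ r / n).reshape(W.shape)             # dL/dW, (m, d)
    return loss, grad, r, D

# Toy problem (sizes chosen so that m * d >= n).
rng = np.random.default_rng(0)
n, d, m = 20, 5, 8
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
W = rng.normal(size=(m, d))
v = rng.choice([-1.0, 1.0], size=m)

loss, grad, r, D = loss_and_grad(W, v, X, y)
s_min = np.linalg.svd(D, compute_uv=False).min()

print("loss:", loss)
print("||r||:", np.linalg.norm(r))
print("n * ||grad|| / s_min(D):", n * np.linalg.norm(grad) / s_min)
```

At a stationary point the gradient norm is zero, so a strictly positive s_min(D) forces the residual, and hence the training loss, to zero. This is the sense in which a large minimum singular value rules out bad stationary points in this sketch.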
Abstract
Neural networks are a powerful class of functions that can be trained with simple gradient descent to achieve state-of-the-art performance on a variety of applications. Despite their practical success, there is a paucity of results that provide theoretical guarantees on why they are so effective. Lying in the center of the problem is the difficulty of analyzing the non-convex loss function with potentially numerous local minima and saddle points. Can neural networks corresponding to the stationary points of the loss function learn the true target function? If yes, what are the key factors contributing to such nice optimization properties? In this paper, we answer these questions by analyzing one-hidden-layer neural networks with ReLU activation, and show that despite the non-convexity, neural networks with diverse units have no spurious local minima. We bypass the non-convexity issue…
Taxonomy
Topics: Neural Networks and Applications
