Computational Complexity of Learning Neural Networks: Smoothness and Degeneracy
Amit Daniely, Nathan Srebro, Gal Vardi

TL;DR
This paper studies the computational hardness of learning neural networks. It shows that even under favorable assumptions such as Gaussian inputs and non-degenerate weights, learning deeper networks remains hard, including in smoothed-analysis settings.
Contribution
The paper proves new hardness results for learning depth-3 ReLU networks under Gaussian inputs with smoothed parameters, and for learning depth-2 networks when both the parameters and the input distribution are smoothed.
Findings
Learning depth-3 ReLU networks under Gaussian inputs is hard even in the smoothed-analysis framework.
Learning depth-2 networks remains hard even when both the network parameters and the input distribution are smoothed.
The hardness results rely on the assumed existence of local pseudorandom generators.
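As a rough illustration of the setting these findings refer to, the following NumPy sketch draws Gaussian inputs and labels them with a depth-3 ReLU network whose parameters have been perturbed by random noise. It is not taken from the paper: the function names (`smooth`, `depth3_relu`), the layer widths, the noise scale `sigma`, and the choice of a linear output neuron are all assumptions made for the example, and the paper's exact architecture and noise model may differ.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def smooth(weights, sigma, rng):
    # Smoothed analysis (illustrative): add i.i.d. Gaussian noise of
    # scale sigma to every parameter of the network.
    return [W + sigma * rng.standard_normal(W.shape) for W in weights]

def depth3_relu(x, W1, W2, w3):
    # Assumed architecture: two hidden ReLU layers and a linear output.
    # x has shape (n, d); the result has shape (n,).
    return relu(relu(x @ W1) @ W2) @ w3

rng = np.random.default_rng(0)
d, k1, k2 = 8, 6, 4  # hypothetical input dimension and hidden widths

# Arbitrary base parameters, then their smoothed version.
base = [
    rng.standard_normal((d, k1)),
    rng.standard_normal((k1, k2)),
    rng.standard_normal(k2),
]
smoothed = smooth(base, sigma=0.1, rng=rng)

# Gaussian input distribution: x ~ N(0, I_d), labeled by the smoothed network.
x = rng.standard_normal((1000, d))
y = depth3_relu(x, *smoothed)

# Non-degeneracy check: singular values of the first-layer weight matrix.
sv = np.linalg.svd(smoothed[0], compute_uv=False)
```

A learner in this setting sees only the pairs `(x, y)`. The hardness results say that, under the local pseudorandom generator assumption, no efficient algorithm can learn such a network in general, even though the noise makes the weight matrices non-degenerate.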
Abstract
Understanding when neural networks can be learned efficiently is a fundamental question in learning theory. Existing hardness results suggest that assumptions on both the input distribution and the network's weights are necessary for obtaining efficient algorithms. Moreover, it was previously shown that depth-2 networks can be efficiently learned under the assumptions that the input distribution is Gaussian, and the weight matrix is non-degenerate. In this work, we study whether such assumptions may suffice for learning deeper networks and prove negative results. We show that learning depth-3 ReLU networks under the Gaussian input distribution is hard even in the smoothed-analysis framework, where a random noise is added to the network's parameters. It implies that learning depth-3 ReLU networks under the Gaussian distribution is hard even if the weight matrices are…
Taxonomy
Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
