TL;DR
This paper establishes computational hardness results for training depth-2 ReLU neural networks: the problem is NP-hard even for a single ReLU, and lower bounds on training time follow from complexity hypotheses such as Gap-ETH.
Contribution
It proves NP-hardness for training depth-2 ReLU networks with a single ReLU and provides lower bounds on training time under complexity assumptions, highlighting fundamental computational limits.
Findings
Training depth-2 ReLU networks is NP-hard even for a single ReLU.
Proper learning algorithms require time exponential in 1/ε² under Gap-ETH and a new hardness-of-approximation hypothesis.
The upper bounds on training time match the lower bounds in their dependence on ε.
Abstract
We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of k ReLUs minimizing the squared error (for k > 1) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error ε. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well known Densest…
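To make the training objective concrete, here is a minimal sketch of the quantity being minimized: the squared loss of a weighted sum of k ReLUs over a training set. The function names, the absence of bias terms, and the averaging over samples are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, float]


def relu(z: float) -> float:
    """Standard ReLU activation: max(0, z)."""
    return max(0.0, z)


def depth2_relu(x: Vector, weights: Sequence[Vector], coeffs: Sequence[float]) -> float:
    """Evaluate f(x) = sum_j a_j * relu(<w_j, x>).

    Assumption: no bias terms; the coefficients a_j may be negative.
    """
    total = 0.0
    for a, w in zip(coeffs, weights):
        inner = sum(wi * xi for wi, xi in zip(w, x))
        total += a * relu(inner)
    return total


def squared_loss(samples: Sequence[Sample], weights: Sequence[Vector],
                 coeffs: Sequence[float]) -> float:
    """Average squared error over the training set (averaging is an assumption)."""
    errors = [(depth2_relu(x, weights, coeffs) - y) ** 2 for x, y in samples]
    return sum(errors) / len(samples)


# Hypothetical 1-D training set whose labels follow y = |x| = relu(x) + relu(-x),
# so it is realizable by a network with k = 2 ReLUs.
samples = [((-2.0,), 2.0), ((1.0,), 1.0), ((3.0,), 3.0)]

loss_two_relus = squared_loss(samples, weights=[(1.0,), (-1.0,)], coeffs=[1.0, 1.0])
loss_one_relu = squared_loss(samples, weights=[(1.0,)], coeffs=[1.0])
```

Evaluating the loss for a fixed choice of weights is cheap; the hardness results concern the search for the weights and coefficients that minimize it. In this toy data the two-ReLU network fits the labels exactly, while the single ReLU shown cannot match the sample at x = -2.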
Videos
Tight Hardness Results for Training Depth-2 ReLU Networks (YouTube)
