The Computational Complexity of ReLU Network Training Parameterized by Data Dimensionality
Vincent Froese, Christoph Hertrich, Rolf Niedermeier

TL;DR
This paper investigates the computational complexity of training two-layer ReLU neural networks, showing that training is W[1]-hard when parameterized by the data dimension and extending a known polynomial-time algorithm to a broader class of loss functions.
Contribution
It provides W[1]-hardness lower bounds for training complexity based on data dimension and extends polynomial-time algorithms to more general loss functions.
Findings
Training complexity is W[1]-hard with respect to data dimension.
Known brute-force strategies are essentially optimal under ETH.
A known polynomial-time algorithm is extended to a broader class of loss functions.
Abstract
Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension d of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter d and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including ℓ^p-loss for all p ∈ [0, ∞]. In particular, we extend a known…
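To make the object of study concrete, the following is a minimal sketch of the training objective for a two-layer ReLU network under an ℓ^p-loss. It is an illustration only, not the paper's algorithm: the function names, the restriction of output weights to +1 or -1, and the conventions for p = 0 (count of misfit points) and p = ∞ (largest absolute error) are assumptions made for this example. Training asks for weights and biases that minimize this value; the hardness results concern how the cost of that search grows with the dimension d of the data points.

```python
import math
from typing import Sequence


def relu(z: float) -> float:
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def network_output(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    signs: Sequence[float],
) -> float:
    """Output of a two-layer ReLU network with k hidden neurons.

    Computes sum_j signs[j] * relu(<weights[j], x> + biases[j]).
    Assumption for this sketch: output weights are given as signs (+1 or -1).
    """
    total = 0.0
    for w, b, a in zip(weights, biases, signs):
        pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += a * relu(pre_activation)
    return total


def lp_training_loss(
    data: Sequence[Sequence[float]],
    labels: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    signs: Sequence[float],
    p: float,
) -> float:
    """Training error of the network on (data, labels) under an l^p-loss.

    Assumed conventions (hypothetical, for illustration):
      p == 0        -> number of points that are not fitted exactly
      p == math.inf -> largest absolute error
      otherwise     -> sum of |error|^p over all points
    """
    residuals = [
        abs(network_output(x, weights, biases, signs) - y)
        for x, y in zip(data, labels)
    ]
    if p == 0:
        return float(sum(1 for r in residuals if r > 0))
    if p == math.inf:
        return max(residuals)
    return sum(r ** p for r in residuals)


# One-dimensional data (d = 1) and a single hidden neuron computing relu(x).
data = [[-1.0], [0.0], [2.0]]
weights, biases, signs = [[1.0]], [0.0], [1.0]

exact_fit = lp_training_loss(data, [0.0, 0.0, 2.0], weights, biases, signs, p=2)
one_miss = lp_training_loss(data, [0.0, 1.0, 2.0], weights, biases, signs, p=2)
```

In this toy instance the single neuron fits the first label set exactly, while the second label set leaves one point off by 1, so the loss is positive for every choice of p.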
