An Approximation Algorithm for training One-Node ReLU Neural Network
Santanu S. Dey, Guanyi Wang, Yao Xie

TL;DR
This paper proves NP-hardness of training a one-node ReLU neural network and introduces an approximation algorithm with provable guarantees, outperforming gradient descent in some scenarios and serving as a good initialization method.
Contribution
The paper presents a novel approximation algorithm for training One-Node-ReLU with theoretical guarantees and demonstrates its practical advantages over gradient descent.
Findings
The algorithm guarantees a $\frac{n}{k}$-approximation for arbitrary data.
In the realizable case, the algorithm finds the global optimal solution.
In some scenarios the algorithm outperforms gradient descent, and its output can serve as an initialization for further training.
Abstract
Training a one-node neural network with ReLU activation function (One-Node-ReLU) is a fundamental optimization problem in deep learning. In this paper, we begin by proving the NP-hardness of training One-Node-ReLU. We then present an approximation algorithm to solve One-Node-ReLU whose running time is $O(n^k)$, where $n$ is the number of samples and $k$ is a predefined integral constant. Apart from $k$, this algorithm does not require pre-processing or tuning of parameters. We analyze the performance of this algorithm under various regimes. First, given an arbitrary set of training samples, we show that the algorithm guarantees a $\frac{n}{k}$-approximation for the One-Node-ReLU training problem. As a consequence, in the realizable case (i.e. when the training error is zero), this approximation algorithm achieves the global optimal solution for the One-Node-ReLU problem.…
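To make the realizable case concrete, here is a minimal sketch of a One-Node-ReLU training objective. It assumes a squared-error loss over predictions of the form max(0, w·x + b); the abstract above does not spell out the loss, and the names `one_node_relu_loss`, `w_true`, and `b_true` are hypothetical. This is not the paper's approximation algorithm, only the objective such an algorithm would minimize.

```python
import numpy as np

def one_node_relu_loss(w, b, X, y):
    """Sum of squared errors of a single ReLU node (assumed loss, see lead-in)."""
    pred = np.maximum(0.0, X @ w + b)  # ReLU applied to an affine function of each sample
    return float(np.sum((pred - y) ** 2))

# Synthetic realizable data: labels are generated by a ReLU node itself,
# so some (w, b) attains zero training error.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))            # n = 20 samples, 3 features
w_true = np.array([1.0, -2.0, 0.5])     # hypothetical ground-truth weights
b_true = 0.1                            # hypothetical ground-truth bias
y = np.maximum(0.0, X @ w_true + b_true)

realizable_loss = one_node_relu_loss(w_true, b_true, X, y)
perturbed_loss = one_node_relu_loss(w_true + 0.5, b_true, X, y)
```

In this setting the optimal objective value is zero, so any multiplicative approximation guarantee, including $\frac{n}{k}$, forces the returned solution to have zero error as well. That is why the realizable case yields a global optimum.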
Taxonomy
Topics: Image Processing and 3D Reconstruction · Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis
