An Approximation Algorithm for training One-Node ReLU Neural Network

Santanu S. Dey; Guanyi Wang; Yao Xie

arXiv:1810.03592 · math.OC · May 23, 2019 · 6 cites

Open Access

TL;DR

This paper proves NP-hardness of training a one-node ReLU neural network and introduces an approximation algorithm with provable guarantees, outperforming gradient descent in some scenarios and serving as a good initialization method.

Contribution

The paper presents a novel approximation algorithm for training One-Node-ReLU with theoretical guarantees and demonstrates its practical advantages over gradient descent.

Findings

01

The algorithm guarantees a $\frac{n}{k}$-approximation for arbitrary data.

02

In the realizable case, the algorithm finds the global optimal solution.

03

The algorithm outperforms gradient descent and improves initialization for training.
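
Findings 01 and 02 are linked by a short argument. The following is a sketch, assuming the standard multiplicative definition of an approximation ratio for a minimization problem; the symbols $z^{\mathrm{alg}}$ (training error returned by the algorithm) and $z^{*}$ (optimal training error) are illustrative names, not notation taken from the paper:

$$
0 \le z^{*} \le z^{\mathrm{alg}} \le \frac{n}{k}\, z^{*}, \qquad z^{*} = 0 \;\Longrightarrow\; z^{\mathrm{alg}} = 0 .
$$

Under that reading, whenever the data can be fit with zero training error, the bound forces the algorithm's error to zero as well, which is the realizable-case statement.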

Abstract

Training a one-node neural network with ReLU activation function (One-Node-ReLU) is a fundamental optimization problem in deep learning. In this paper, we begin by proving the NP-hardness of training One-Node-ReLU. We then present an approximation algorithm to solve One-Node-ReLU whose running time is $O(n^{k})$, where $n$ is the number of samples and $k$ is a predefined integral constant. Apart from $k$, this algorithm does not require pre-processing or tuning of parameters. We analyze the performance of this algorithm under various regimes. First, given an arbitrary training data set, we show that the algorithm guarantees a $\frac{n}{k}$-approximation for the One-Node-ReLU training problem. As a consequence, in the realizable case (i.e. when the training error is zero), this approximation algorithm achieves the global optimal solution for the One-Node-ReLU problem.…
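
To make the problem in the abstract concrete, here is a minimal sketch of a One-Node-ReLU training objective and of one step of the gradient-descent baseline it is compared against. It assumes a sum-of-squared-errors loss, which the abstract does not state explicitly; the names `one_node_relu_loss`, `gradient_step`, `a`, `b`, `X`, and `y` are hypothetical. This is not the paper's approximation algorithm, only the objective and the baseline.

```python
import numpy as np


def one_node_relu_loss(a, b, X, y):
    """Assumed objective: sum_i (max(0, a . x_i + b) - y_i)^2."""
    pred = np.maximum(0.0, X @ a + b)
    return float(np.sum((pred - y) ** 2))


def gradient_step(a, b, X, y, lr=0.01):
    """One plain (sub)gradient-descent step on the assumed loss."""
    z = X @ a + b
    pred = np.maximum(0.0, z)
    # The ReLU passes gradient only where the pre-activation is positive.
    active = (z > 0).astype(float)
    resid = 2.0 * (pred - y) * active
    grad_a = X.T @ resid
    grad_b = resid.sum()
    return a - lr * grad_a, b - lr * grad_b


# Realizable toy data: labels are generated by a ReLU unit, so zero error is attainable.
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
a_true = np.array([1.0, -2.0, 0.5])
b_true = 0.3
y = np.maximum(0.0, X @ a_true + b_true)

realizable_loss = one_node_relu_loss(a_true, b_true, X, y)
zero_init_loss = one_node_relu_loss(np.zeros(d), 0.0, X, y)
```

In this realizable setup the generating parameters give zero training error, which is the case where the abstract says the approximation algorithm reaches the global optimum. Note that at the all-zero initialization every unit is inactive, so the subgradient used here vanishes and gradient descent does not move; this is one illustration of why the choice of initialization matters for the baseline.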

Taxonomy

Topics: Image Processing and 3D Reconstruction · Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis