Agnostic Learning of Arbitrary ReLU Activation under Gaussian Marginals
Anxin Guo, Aravindan Vijayaraghavan

TL;DR
This paper introduces a polynomial-time statistical query (SQ) algorithm for learning an arbitrarily biased ReLU neuron under Gaussian marginals, achieving a constant-factor approximation and highlighting the limitations of gradient descent-based methods.
Contribution
The paper presents the first SQ algorithm with a constant-factor approximation guarantee for biased ReLU neurons. It contrasts this with the limitations of gradient descent and thereby separates SQ algorithms from correlational statistical query (CSQ) algorithms.
Findings
The first SQ algorithm with a constant-factor approximation for an arbitrarily biased ReLU neuron.
Gradient descent-based algorithms, which fall within the CSQ class, cannot achieve this approximation in polynomial time.
The result separates the capabilities of SQ and CSQ algorithms for learning ReLU neurons.
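The distinction between the two query models can be sketched in a few lines. This is an illustration of the standard definitions only, not the paper's algorithm: a correlational query estimates E[y * q(x)], while a general statistical query estimates E[q(x, y)] for a function that may depend on the label arbitrarily. The helper names `csq_query` and `sq_query`, the sample size, and the toy labels are all assumptions made for this example, and a real SQ oracle would also return answers only up to a tolerance, which is omitted here.

```python
import numpy as np

def csq_query(X, y, q):
    """Correlational query: empirical estimate of E[y * q(x)].
    The query function q sees only x; the label enters linearly."""
    return float(np.mean(y * q(X)))

def sq_query(X, y, q):
    """General statistical query: empirical estimate of E[q(x, y)].
    The query function q may depend on the label in any way."""
    return float(np.mean(q(X, y)))

# Toy data (hypothetical): Gaussian inputs, labels from a negatively biased ReLU.
rng = np.random.default_rng(0)
X = rng.standard_normal((50_000, 3))
y = np.maximum(X[:, 0] - 1.0, 0.0)

# A CSQ can only correlate the label with a function of x.
corr = csq_query(X, y, lambda X: X[:, 0])

# An SQ can, for example, ask how often the label is nonzero,
# which is not of the form E[y * q(x)].
frac_active = sq_query(X, y, lambda X, y: (y > 0).astype(float))
```

In this toy setup `frac_active` estimates the probability that a standard Gaussian exceeds 1, a quantity tied directly to the bias, which shows the kind of label-dependent statistic that the SQ model permits and the CSQ model does not.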
Abstract
We consider the problem of learning an arbitrarily biased ReLU activation (or neuron) over Gaussian marginals with the squared loss objective. Despite the ReLU neuron being the basic building block of modern neural networks, we still do not understand the basic algorithmic question of whether one arbitrary ReLU neuron is learnable in the non-realizable setting. In particular, all existing polynomial time algorithms only provide approximation guarantees for the better-behaved unbiased setting or restricted bias setting. Our main result is a polynomial time statistical query (SQ) algorithm that gives the first constant factor approximation for arbitrary bias. It outputs a ReLU activation that achieves a loss of $O(\mathrm{OPT}) + \epsilon$ in time $\mathrm{poly}(d, 1/\epsilon)$, where $\mathrm{OPT}$ is the loss obtained by the optimal ReLU activation. Our algorithm presents an…
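To make the objective concrete, here is a minimal sketch of the squared loss of a biased ReLU neuron on Gaussian inputs. It is not the paper's algorithm. The names `relu_neuron` and `squared_loss`, the dimension, the bias value, and the additive label noise are assumptions chosen for illustration; in the agnostic setting the labels may be arbitrary rather than a noisy ReLU.

```python
import numpy as np

def relu_neuron(X, w, b):
    """Biased ReLU activation: max(<w, x> + b, 0) for each row x of X."""
    return np.maximum(X @ w + b, 0.0)

def squared_loss(X, y, w, b):
    """Empirical squared loss of the neuron (w, b) on samples (X, y)."""
    return float(np.mean((relu_neuron(X, w, b) - y) ** 2))

rng = np.random.default_rng(0)
d, n = 5, 20_000

# Gaussian marginals: each input is drawn from N(0, I_d).
X = rng.standard_normal((n, d))

# Hypothetical ground truth with a negative bias, so the neuron is
# active on only a small fraction of the inputs.
w_true = np.ones(d) / np.sqrt(d)
b_true = -1.5

# Illustrative labels: the neuron's output plus small Gaussian noise.
y = relu_neuron(X, w_true, b_true) + 0.1 * rng.standard_normal(n)

loss_true = squared_loss(X, y, w_true, b_true)      # close to the noise variance
loss_zero = squared_loss(X, y, np.zeros(d), 0.0)    # loss of the all-zero predictor
```

With a strongly negative bias the optimal loss and the loss of the trivial zero predictor are both small in absolute terms, which hints at why a multiplicative guarantee relative to the optimal loss is harder to obtain in the biased setting than in the unbiased one.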
