Reliably Learning the ReLU in Polynomial Time

Surbhi Goel; Varun Kanade; Adam Klivans; Justin Thaler

arXiv:1611.10258 · cs.LG · December 6, 2016 · 54 cites

TL;DR

This paper introduces the first dimension-efficient, polynomial-time algorithms for reliably learning ReLUs in the agnostic setting, with consequences for learning ReLU networks and for convex piecewise-linear fitting.

Contribution

It presents the first efficient algorithms for learning ReLUs in the reliable agnostic model, where labels may be arbitrary, using kernel methods and polynomial approximations.
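
To make the phrase "polynomial approximations" concrete, here is a minimal sketch of approximating the scalar ReLU $t \mapsto \max(0, t)$ on $[-1, 1]$ by a low-degree polynomial. It uses a truncated Chebyshev series as a stand-in; this is an assumption chosen for illustration and is not claimed to be the construction used in the paper. The function names (`chebyshev_coefficients`, `evaluate`, `max_error`) and the quadrature size are hypothetical.

```python
import math


def relu(t):
    """Scalar ReLU: max(0, t)."""
    return max(0.0, t)


def chebyshev_coefficients(f, degree, num_nodes=2000):
    """Estimate the first degree+1 Chebyshev coefficients of f on [-1, 1].

    Uses Gauss-Chebyshev quadrature with num_nodes nodes (an illustrative
    choice, not taken from the paper).
    """
    thetas = [math.pi * (j + 0.5) / num_nodes for j in range(num_nodes)]
    values = [f(math.cos(th)) for th in thetas]
    coeffs = []
    for k in range(degree + 1):
        s = sum(v * math.cos(k * th) for v, th in zip(values, thetas))
        coeffs.append(2.0 * s / num_nodes)
    coeffs[0] /= 2.0  # the constant term carries a factor of 1/2
    return coeffs


def evaluate(coeffs, t):
    """Evaluate sum_k coeffs[k] * T_k(t) using T_k(cos(theta)) = cos(k * theta)."""
    theta = math.acos(max(-1.0, min(1.0, t)))
    return sum(c * math.cos(k * theta) for k, c in enumerate(coeffs))


def max_error(degree, grid_size=2001):
    """Largest gap between the ReLU and its degree-`degree` approximation on a grid."""
    coeffs = chebyshev_coefficients(relu, degree)
    grid = [-1.0 + 2.0 * i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(relu(t) - evaluate(coeffs, t)) for t in grid)
```

Calling `max_error(4)` and `max_error(16)` shows the uniform error shrinking as the degree grows. The decay is slow because of the kink at $t = 0$, which is one reason the degree needed for a given accuracy matters for the running time of polynomial-based methods.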

Findings

01. The algorithms run in time polynomial in the input dimension $n$.

02. They yield a PTAS for maximizing ReLUs, since they work for any error parameter $\epsilon = \Omega(1/\log n)$.

03. They give the first efficient algorithms for learning ReLU networks and for convex piecewise-linear fitting.

Abstract

We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form $x \mapsto \max(0, w \cdot x)$ with $w \in S^{n-1}$. Our algorithm works in the challenging Reliable Agnostic learning model of Kalai, Kanade, and Mansour (2009) where the learner is given access to a distribution $D$ on labeled examples but the labeling may be arbitrary. We construct a hypothesis that simultaneously minimizes the false-positive rate and the loss on inputs given positive labels by $D$, for any convex, bounded, and Lipschitz loss function. The algorithm runs in polynomial-time (in $n$) with respect to any distribution on $S^{n-1}$ (the unit sphere in $n$ dimensions) and for any error parameter $\epsilon = \Omega(1/\log n)$ (this yields a PTAS for a question raised by F. Bach on…
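
As a rough illustration of the two quantities the abstract says are minimized simultaneously, the sketch below evaluates a ReLU $x \mapsto \max(0, w \cdot x)$ and computes an empirical false-positive rate together with an empirical loss on positively labeled examples. The details are assumptions made for this example and are not quoted from the paper: labels are taken to lie in $[0, 1]$, a false positive is counted when the hypothesis is nonzero on an example labeled $0$, both quantities are normalized by the total sample size, and squared loss stands in for a convex, bounded, Lipschitz loss. All function names are hypothetical.

```python
import math


def normalize(w):
    """Scale w to unit Euclidean norm so that it lies on the sphere S^{n-1}."""
    norm = math.sqrt(sum(c * c for c in w))
    return [c / norm for c in w]


def relu_unit(w, x):
    """The ReLU x -> max(0, w . x)."""
    return max(0.0, sum(wi * xi for wi, xi in zip(w, x)))


def squared_loss(pred, y):
    """Illustrative loss; convex, and bounded and Lipschitz on a bounded range."""
    return (pred - y) ** 2


def reliable_losses(h, samples, loss=squared_loss):
    """Return (false-positive rate, loss on positively labeled examples).

    Assumption for this sketch: a false positive is an example with label 0
    on which h outputs a nonzero value; both terms are divided by len(samples).
    """
    n = len(samples)
    false_pos = sum(1 for x, y in samples if y == 0 and h(x) != 0) / n
    pos_loss = sum(loss(h(x), y) for x, y in samples if y > 0) / n
    return false_pos, pos_loss
```

For example, with `w = normalize([3.0, 4.0])` and `h = lambda x: relu_unit(w, x)`, passing a list of `(x, y)` pairs to `reliable_losses(h, samples)` returns the two empirical quantities for that hypothesis.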

Taxonomy

Topics: Neural Networks and Applications · Anomaly Detection Techniques and Applications · Machine Learning and Algorithms