Learning Neural Networks with Two Nonlinear Layers in Polynomial Time
Surbhi Goel, Adam Klivans

TL;DR
This paper presents a polynomial-time algorithm for learning neural networks with two nonlinear layers. The algorithm works for any distribution on the unit ball, makes no structural assumptions about the network, and combines isotonic regression with kernel methods.
Contribution
The paper introduces Alphatron, the first assumption-free, provably efficient algorithm for learning neural networks with two nonlinear layers, and extends the learning guarantees to probabilistic concepts with noise tolerance.
Findings
The algorithm succeeds for any distribution on the unit ball.
The output hypothesis provides efficient oracle access to interpretable features.
Improves results for PAC learning Boolean functions in a more general setting.
Abstract
We give a polynomial-time algorithm for learning neural networks with one layer of sigmoids feeding into any Lipschitz, monotone activation function (e.g., sigmoid or ReLU). We make no assumptions on the structure of the network, and the algorithm succeeds with respect to any distribution on the unit ball in n dimensions (hidden weight vectors also have unit norm). This is the first assumption-free, provably efficient algorithm for learning neural networks with two nonlinear layers. Our algorithm, Alphatron, is a simple, iterative update rule that combines isotonic regression with kernel methods. It outputs a hypothesis that yields efficient oracle access to interpretable features. It also suggests a new approach to Boolean learning problems via real-valued conditional-mean functions, sidestepping traditional hardness results from computational learning theory. Along…
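The abstract does not spell out the update rule, so the following is only a minimal sketch of what a kernelized, iterative update with a known monotone activation might look like. Everything specific here is an assumption for illustration and is not taken from the text above: the function names (`fit_kernel_glm`, `poly_kernel`), the polynomial kernel, the learning rate, the step count, and the choice of a sigmoid as the outer activation `u`. It should not be read as the authors' exact procedure.

```python
import math
import random


def sigmoid(z):
    """Monotone, Lipschitz activation used as the outer function u (assumed)."""
    return 1.0 / (1.0 + math.exp(-z))


def poly_kernel(x, y, degree=2):
    """Assumed kernel: sum of powers (x . y)^j for j = 0..degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return sum(dot ** j for j in range(degree + 1))


def fit_kernel_glm(X, y, u=sigmoid, kernel=poly_kernel, lr=1.0, steps=200):
    """Hypothetical helper: iteratively fit h(x) = u(sum_i alpha_i * K(x, x_i)).

    Each step nudges alpha_i in proportion to the residual y_i - h(x_i).
    """
    m = len(X)
    # Precompute the Gram matrix once; it does not change across steps.
    gram = [[kernel(X[i], X[j]) for j in range(m)] for i in range(m)]
    alpha = [0.0] * m
    for _ in range(steps):
        preds = [u(sum(alpha[j] * gram[i][j] for j in range(m))) for i in range(m)]
        alpha = [alpha[i] + (lr / m) * (y[i] - preds[i]) for i in range(m)]

    def predict(x):
        return u(sum(a * kernel(x, xi) for a, xi in zip(alpha, X)))

    return predict


def mean_squared_error(predict, X, y):
    """Average squared difference between predictions and targets."""
    return sum((predict(x) - t) ** 2 for x, t in zip(X, y)) / len(X)


if __name__ == "__main__":
    # Toy data: points in the 2-D unit ball, targets from a unit-norm weight vector.
    rng = random.Random(0)
    X = []
    for _ in range(30):
        r, t = math.sqrt(rng.random()), 2 * math.pi * rng.random()
        X.append((r * math.cos(t), r * math.sin(t)))
    w = (0.6, 0.8)
    y = [sigmoid(w[0] * a + w[1] * b) for a, b in X]
    h = fit_kernel_glm(X, y)
    print("training MSE:", mean_squared_error(h, X, y))
```

The sketch keeps every iterate in the span of the training points, so the hypothesis is stored as one coefficient per example rather than as explicit network weights. A fuller version would likely also hold out validation data to choose among iterates; that step is omitted here.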
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Neural Networks and Applications
