Learning Halfspaces and Neural Networks with Random Initialization
Yuchen Zhang, Jason D. Lee, Martin J. Wainwright, Michael I. Jordan

TL;DR
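This paper introduces algorithms for learning halfspaces and neural networks that run multiple rounds of random initialization followed by optimization. They achieve arbitrarily small excess risk in time polynomial in the input dimension and the sample size, though exponential in a quantity that depends on the Lipschitz constant of the loss and the target excess risk. When the data are separable with constant margin, a polynomial-time learning algorithm exists.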
Contribution
The paper presents algorithms for non-convex empirical risk minimization that combine repeated random initialization with arbitrary optimization steps, with guarantees of small excess risk and, when the data are separable with constant margin, polynomial-time learnability.
Findings
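The algorithms achieve arbitrarily small excess risk in time polynomial in the input dimension and the sample size, but exponential in a quantity that depends on the Lipschitz constant of the loss and the target excess risk.
If some neural network separates the data with constant margin, a polynomial-time algorithm learns a neural network that separates the training data.
The same learnability result is established when the labels are subject to random flips.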
Abstract
We study non-convex empirical risk minimization for learning halfspaces and neural networks. For loss functions that are $L$-Lipschitz continuous, we present algorithms to learn halfspaces and multi-layer neural networks that achieve arbitrarily small excess risk $\epsilon > 0$. The time complexity is polynomial in the input dimension $d$ and the sample size $n$, but exponential in the quantity $(L/\epsilon^2)\log(L/\epsilon)$. These algorithms run multiple rounds of random initialization followed by arbitrary optimization steps. We further show that if the data is separable by some neural network with constant margin $\gamma > 0$, then there is a polynomial-time algorithm for learning a neural network that separates the training data with margin $\Omega(\gamma)$. As a consequence, the algorithm achieves arbitrary generalization error $\epsilon > 0$ with $\mathrm{poly}(1/\epsilon)$ sample and…
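The pattern the abstract describes, several rounds of random initialization each followed by optimization steps, can be illustrated with a minimal sketch for a halfspace. This is not the paper's algorithm and carries none of its guarantees: the clipped ramp loss, the projected gradient steps, the unit-ball constraint, and every name here (`make_toy_data`, `empirical_risk`, `risk_gradient`, `random_restart_halfspace`) are assumptions chosen for illustration.

```python
import numpy as np


def make_toy_data(n=200, d=5, seed=0):
    """Hypothetical toy data: labels given by a random halfspace."""
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=d)
    w_star /= np.linalg.norm(w_star)
    X = rng.normal(size=(n, d))
    y = np.where(X @ w_star >= 0.0, 1.0, -1.0)
    return X, y


def empirical_risk(w, X, y):
    """Mean clipped ramp loss of the margins y * <w, x>.

    The loss clip(1 - m, 0, 1) is non-convex and 1-Lipschitz in m
    (an illustrative choice, not taken from the paper).
    """
    margins = y * (X @ w)
    return float(np.mean(np.clip(1.0 - margins, 0.0, 1.0)))


def risk_gradient(w, X, y):
    """(Sub)gradient of the empirical risk with respect to w."""
    margins = y * (X @ w)
    # The ramp has slope -1 only where 0 < margin < 1, and 0 elsewhere.
    active = (margins > 0.0) & (margins < 1.0)
    return -(X * (y * active)[:, None]).mean(axis=0)


def random_restart_halfspace(X, y, rounds=20, steps=200, lr=0.5, seed=0):
    """Run several random restarts and keep the lowest empirical risk."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    best_w, best_risk = None, np.inf
    for _ in range(rounds):
        # Random initialization on the unit sphere.
        w = rng.normal(size=d)
        w /= np.linalg.norm(w)
        # Optimization steps: projected gradient descent onto the unit ball.
        for _ in range(steps):
            w = w - lr * risk_gradient(w, X, y)
            norm = np.linalg.norm(w)
            if norm > 1.0:
                w = w / norm
        risk = empirical_risk(w, X, y)
        if risk < best_risk:
            best_w, best_risk = w, risk
    return best_w, best_risk


if __name__ == "__main__":
    X, y = make_toy_data()
    w_hat, risk = random_restart_halfspace(X, y)
    print("best empirical risk over restarts:", risk)
```

Because the loss is non-convex, a single run can stall in a flat region of the ramp. Keeping the best of several independently initialized runs is the simplest way to reduce the dependence on any one starting point, and with a fixed seed adding rounds can never make the returned risk worse.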
Taxonomy
Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
