$\ell_1$-regularized Neural Networks are Improperly Learnable in Polynomial Time
Yuchen Zhang, Jason D. Lee, Michael I. Jordan

TL;DR
This paper introduces a kernel-based method for improperly learning multi-layer neural networks with bounded weights, achieving polynomial time complexity regardless of the number of neurons, applicable to various activation functions.
Contribution
It presents a polynomial-time algorithm for improperly learning neural networks in which the $\ell_1$-norm of each neuron's incoming weights is bounded. The running time is independent of the number of neurons, and the method applies to both sigmoid-like and ReLU-like activations.
Findings
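With probability at least $1 - \delta$, the learned predictor has generalization error at most $\epsilon$ worse than that of the target network.
Sample and time complexity are polynomial in the input dimension and in $(1/\epsilon, \log(1/\delta), F(k, L))$, where $F(k, L)$ depends on the number of hidden layers $k$, the $\ell_1$-norm bound $L$, and the activation function, but not on the number of neurons.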
Any sufficiently sparse neural network can be learned efficiently.
Abstract
We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has $k$ hidden layers and that the $\ell_1$-norm of the incoming weights of any neuron is bounded by $L$. We present a kernel-based method, such that with probability at least $1 - \delta$, it learns a predictor whose generalization error is at most $\epsilon$ worse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in $(1/\epsilon, \log(1/\delta), F(k, L))$, where $F(k, L)$ is a function depending on $(k, L)$ and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like activation functions and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.
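To make "improper learning with a kernel-based method" concrete, here is a minimal sketch rather than the authors' algorithm. The abstract names no kernel and no training procedure, so two assumptions are made for illustration: the inverse-polynomial kernel $1/(2 - \langle x, y \rangle)$ on unit-norm inputs, and kernel ridge regression as a simple stand-in for the learning step. The names `normalize`, `kernel`, `solve`, `fit`, and `predict` are hypothetical. The fitted predictor is a kernel expansion over the training points, not a neural network, which is what makes the learning improper.

```python
import math

def normalize(x):
    """Scale x to unit Euclidean norm so that <x, y> lies in [-1, 1]."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x] if norm > 0 else list(x)

def kernel(x, y):
    """Assumed kernel (not stated in the abstract): 1 / (2 - <x, y>)."""
    return 1.0 / (2.0 - sum(a * b for a, b in zip(x, y)))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def fit(X, y, lam=1e-3):
    """Kernel ridge regression (a stand-in): solve (K + lam * I) alpha = y."""
    X = [normalize(x) for x in X]
    n = len(X)
    K = [[kernel(X[i], X[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return X, solve(K, y)

def predict(model, x):
    """Evaluate the kernel expansion sum_i alpha_i * K(x_i, x)."""
    X, alpha = model
    x = normalize(x)
    return sum(a * kernel(xi, x) for a, xi in zip(alpha, X))

# Toy data: four points on the unit circle with alternating labels.
model = fit([[1, 0], [0, 1], [-1, 0], [0, -1]], [1.0, -1.0, 1.0, -1.0])
```

The cost of fitting depends on the number of training points and the input dimension, not on the size of any network, which mirrors the abstract's claim that the complexity is independent of the number of neurons.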
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
