$\ell_1$-regularized Neural Networks are Improperly Learnable in Polynomial Time

Yuchen Zhang; Jason D. Lee; Michael I. Jordan

arXiv:1510.03528 · cs.LG · October 14, 2015 · 1 citation

TL;DR

This paper introduces a kernel-based method for improperly learning multi-layer neural networks in which the $\ell_1$-norm of each neuron's incoming weights is bounded. Its sample and time complexity are polynomial in the input dimension and independent of the number of neurons, and it applies to both sigmoid-like and ReLU-like activation functions.

Contribution

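The paper presents a polynomial-time improper learning algorithm for multi-layer neural networks whose neurons have incoming weights of bounded $\ell_1$-norm, a class that includes sufficiently sparse networks. The complexity does not depend on the number of neurons, and the algorithm covers sigmoid-like and ReLU-like activations.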
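As a rough illustration of why sparsity and bounded weights matter here, the sketch below computes the largest $\ell_1$-norm of any neuron's incoming weights for a randomly generated sparse network. The helper names (`max_incoming_l1`, `sparse_layer`), the layer sizes, and the choice of at most 3 nonzero weights of magnitude at most 0.5 per neuron are all assumptions made for the example, not taken from the paper.

```python
import numpy as np

def max_incoming_l1(weights):
    """Largest l1-norm of any neuron's incoming weight vector.

    `weights` is a list of 2-D arrays, one per layer, each of shape
    (n_out, n_in); row j holds the incoming weights of neuron j.
    """
    return max(float(np.abs(W).sum(axis=1).max()) for W in weights)

def sparse_layer(rng, n_out, n_in, s=3, bound=0.5):
    """Hypothetical layer: each neuron has s nonzero incoming weights,
    each drawn uniformly from [-bound, bound]."""
    W = np.zeros((n_out, n_in))
    for j in range(n_out):
        idx = rng.choice(n_in, size=s, replace=False)
        W[j, idx] = rng.uniform(-bound, bound, size=s)
    return W

rng = np.random.default_rng(0)
# A wide network: the widths (200 neurons per hidden layer) are arbitrary.
layers = [
    sparse_layer(rng, 200, 50),
    sparse_layer(rng, 200, 200),
    sparse_layer(rng, 1, 200),
]
L_bound = max_incoming_l1(layers)
print(f"max incoming l1-norm: {L_bound:.3f}")
```

With at most `s` nonzero weights of magnitude at most `bound` per neuron, the norm can never exceed `s * bound`, however wide the layers are. This is the sense in which a sparse network with bounded weights falls into the bounded $\ell_1$-norm class.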

Findings

01

With probability at least $1 - \delta$, the learned predictor has generalization error at most $\epsilon$ worse than that of the target network.

02

Sample and time complexity are polynomial in the input dimension and in $(1/\epsilon, \log(1/\delta), F(k, L))$, where $F(k, L)$ does not depend on the number of neurons.

03

Any sufficiently sparse neural network can be learned efficiently.
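The three findings can be written as one statement. The notation below is introduced only for this summary and is an assumption, not the paper's own: $f^*$ is the target network with $k$ hidden layers and incoming $\ell_1$-norm at most $L$, $\hat{f}$ is the returned predictor, $\ell$ is the loss, $d$ is the input dimension, $n$ is the sample size, and $T$ is the running time.

```latex
% Hedged restatement of the guarantee; all symbols other than
% k, L, \epsilon, \delta and F(k, L) are assumed notation.
\Pr\!\left[\, \mathbb{E}\big[\ell(\hat{f}(x), y)\big]
  \;\le\; \mathbb{E}\big[\ell(f^{*}(x), y)\big] + \epsilon \,\right]
  \;\ge\; 1 - \delta,
\qquad
n,\; T \;=\; \mathrm{poly}\!\big(d,\ 1/\epsilon,\ \log(1/\delta),\ F(k, L)\big).
```

Because the learning is improper, $\hat{f}$ need not itself be a neural network; it only has to predict nearly as well as $f^*$.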

Abstract

We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has $k$ hidden layers and that the $\ell_1$-norm of the incoming weights of any neuron is bounded by $L$. We present a kernel-based method, such that with probability at least $1 - \delta$, it learns a predictor whose generalization error is at most $\epsilon$ worse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in $(1/\epsilon, \log(1/\delta), F(k, L))$, where $F(k, L)$ is a function depending on $(k, L)$ and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like activation functions and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.
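To make the idea of kernel-based improper learning concrete, here is a minimal sketch that fits a kernel predictor to data generated by a small network with $\ell_1$-bounded incoming weights. The abstract does not specify the kernel, the loss, or the optimization procedure, so everything here is an assumption for illustration: the inverse polynomial kernel $K(x, y) = 1/(2 - \langle x, y \rangle)$ on unit-norm inputs, the squared loss, the ridge penalty `lam`, the tanh target, and all function names. It should not be read as the paper's algorithm.

```python
import numpy as np

def inverse_poly_kernel(X, Y):
    # Assumed kernel: K(x, y) = 1 / (2 - <x, y>).
    # Well defined when all rows of X and Y have l2-norm at most 1.
    return 1.0 / (2.0 - X @ Y.T)

def normalize(X):
    # Scale each row to unit l2-norm so that <x, y> <= 1.
    return X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)

def fit(X, y, lam=1e-4):
    # Kernel ridge regression: solve (K + lam * n * I) alpha = y.
    n = len(X)
    K = inverse_poly_kernel(X, X)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def predict(X_train, alpha, X_new):
    return inverse_poly_kernel(X_new, X_train) @ alpha

rng = np.random.default_rng(0)
d, m, n = 5, 20, 400  # input dim, hidden width, sample size (arbitrary)

# Hypothetical target: one hidden layer of tanh units, every incoming
# weight vector rescaled to have l1-norm exactly 1.
W1 = rng.normal(size=(m, d))
W1 /= np.abs(W1).sum(axis=1, keepdims=True)
w2 = rng.normal(size=m)
w2 /= np.abs(w2).sum()

def target(X):
    return np.tanh(X @ W1.T) @ w2

X_train = normalize(rng.normal(size=(n, d)))
X_test = normalize(rng.normal(size=(200, d)))
y_train, y_test = target(X_train), target(X_test)

alpha = fit(X_train, y_train)
test_mse = float(np.mean((predict(X_train, alpha, X_test) - y_test) ** 2))
# Baseline: always predict the mean training label.
baseline_mse = float(np.mean((y_test - y_train.mean()) ** 2))
print(f"kernel predictor MSE: {test_mse:.2e}, mean baseline MSE: {baseline_mse:.2e}")
```

The fitted predictor is a weighted sum of kernel evaluations, not a neural network, which is what makes the learning improper. The cost of `fit` depends on the sample size `n` and the input dimension `d` but not on the hidden width `m` of the target.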

Taxonomy

Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms