Breaking the Curse of Dimensionality with Convex Neural Networks

Francis Bach (LIENS; SIERRA)

arXiv:1412.8690 · cs.LG · November 1, 2016 · 322 cites

TL;DR

This paper analyzes neural networks with a single hidden layer and homogeneous activation functions, demonstrating their ability to adapt to low-dimensional structures and perform high-dimensional variable selection, with theoretical insights into their generalization and computational aspects.

Contribution

It provides a detailed theoretical analysis of convex neural networks, showing that they adapt to low-dimensional structure, can perform variable selection in high dimensions, and admit convex relaxations that can match the same generalization bounds.

Findings

01

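Single-hidden-layer neural networks with an unbounded number of hidden units can adapt to unknown low-dimensional linear structures, such as dependence on a projection of the inputs onto a low-dimensional subspace.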

02

Sparsity-inducing norms on the input weights enable high-dimensional non-linear variable selection without strong assumptions on the data.

03

Convex relaxations can achieve the same generalization bounds, but their computational feasibility remains open.

Abstract

We consider neural networks with a single hidden layer and non-decreasing homogeneous activation functions like the rectified linear units. By letting the number of hidden units grow unbounded and using classical non-Euclidean regularization tools on the output weights, we provide a detailed theoretical analysis of their generalization performance, with a study of both the approximation and the estimation errors. We show in particular that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace. Moreover, when using sparsity-inducing norms on the input weights, we show that high-dimensional non-linear variable selection may be achieved, without any strong assumption regarding the data and with a total number of variables potentially exponential in the number of observations. In addition,…
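The model class in the abstract can be made concrete with a small finite-width sketch. The code below is illustrative only and is not taken from the paper: it assumes a ReLU activation, a squared loss, and an l1 penalty on the output weights as one example of a non-Euclidean regularizer (the abstract does not name the norm). The names `predict`, `penalized_loss`, and `rescale` are hypothetical. The `rescale` helper shows why positive homogeneity matters: the scale of each hidden unit can be moved into its output weight without changing the function, so only the output weights need to be penalized.

```python
import numpy as np

def relu(z):
    # Rectified linear unit: positively homogeneous, relu(c * z) = c * relu(z) for c > 0.
    return np.maximum(z, 0.0)

def predict(X, W, b, eta):
    # Single hidden layer: f(x) = sum_j eta_j * relu(w_j . x + b_j).
    return relu(X @ W.T + b) @ eta

def penalized_loss(X, y, W, b, eta, lam):
    # Squared loss plus an l1 penalty on the output weights (assumed choice of norm).
    residual = predict(X, W, b, eta) - y
    return 0.5 * np.mean(residual ** 2) + lam * np.sum(np.abs(eta))

def rescale(W, b, eta):
    # Normalize each hidden unit (w_j, b_j) to unit Euclidean norm and
    # absorb the scale into eta_j; the network function is unchanged.
    scale = np.sqrt(np.sum(W ** 2, axis=1) + b ** 2)
    return W / scale[:, None], b / scale, eta * scale

# Toy data and random weights, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # 5 observations, 3 input variables
y = rng.normal(size=5)
W = rng.normal(size=(4, 3))   # 4 hidden units
b = rng.normal(size=4)
eta = rng.normal(size=4)

W2, b2, eta2 = rescale(W, b, eta)
same_function = np.allclose(predict(X, W, b, eta), predict(X, W2, b2, eta2))
```

Here `same_function` checks numerically that rescaling leaves the predictions unchanged. Letting the number of hidden units grow, as the abstract describes, would correspond to increasing the number of rows of `W` while the penalty on `eta` controls complexity.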

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Machine Learning and ELM