Breaking the Curse of Dimensionality with Convex Neural Networks
Francis Bach (LIENS, SIERRA)

TL;DR
This paper analyzes neural networks with a single hidden layer and homogeneous activation functions, demonstrating their ability to adapt to low-dimensional structures and perform high-dimensional variable selection, with theoretical insights into their generalization and computational aspects.
Contribution
It provides a detailed theoretical analysis of convex neural networks, showing their adaptivity, variable selection capabilities, and conditions for convex relaxations in high-dimensional settings.
Findings
Neural networks with an unbounded number of hidden units can adapt to unknown low-dimensional linear structures.
Sparsity-inducing norms enable high-dimensional variable selection without strong data assumptions.
Under simple conditions, convex relaxations can achieve the same generalization error bounds, but whether they can be computed in polynomial time remains open.
Abstract
We consider neural networks with a single hidden layer and non-decreasing homogeneous activation functions like the rectified linear units. By letting the number of hidden units grow unbounded and using classical non-Euclidean regularization tools on the output weights, we provide a detailed theoretical analysis of their generalization performance, with a study of both the approximation and the estimation errors. We show in particular that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace. Moreover, when using sparsity-inducing norms on the input weights, we show that high-dimensional non-linear variable selection may be achieved, without any strong assumption regarding the data and with a total number of variables potentially exponential in the number of observations. In addition,…
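The role of homogeneity in the setup above can be illustrated with a small numerical sketch. It is not the paper's algorithm: it only assumes a finite network f(x) = sum_j eta_j * relu(w_j . x) with no bias term, and the names `predict`, `normalize_units`, and `l1_penalty` are hypothetical helpers introduced for illustration. Because the rectified linear unit is positively homogeneous, the scale of each input-weight vector can be moved into its output weight without changing the function, which is why a penalty can be placed on the output weights alone once the input weights are normalized.

```python
import numpy as np

def relu(u):
    # Rectified linear unit: relu(c * u) == c * relu(u) for any c > 0.
    return np.maximum(u, 0.0)

def predict(X, W, eta):
    # Single hidden layer, no bias (an assumption of this sketch):
    # f(x) = sum_j eta_j * relu(w_j . x), with one row of W per hidden unit.
    return relu(X @ W.T) @ eta

def normalize_units(W, eta):
    # Rescale each input-weight vector to unit Euclidean norm and
    # absorb the scale into the matching output weight.
    norms = np.linalg.norm(W, axis=1)
    return W / norms[:, None], eta * norms

def l1_penalty(eta):
    # Sparsity-inducing l1 penalty on the output weights.
    return np.abs(eta).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))    # 5 inputs in dimension 3
W = rng.normal(size=(4, 3))    # 4 hidden units
eta = rng.normal(size=4)       # output weights

W_unit, eta_scaled = normalize_units(W, eta)

# The two parameterizations define the same function on X.
same = np.allclose(predict(X, W, eta), predict(X, W_unit, eta_scaled))
print("predictions unchanged after normalization:", same)
print("l1 penalty on rescaled output weights:", l1_penalty(eta_scaled))
```

In this normalized form, the l1 penalty on the output weights is a finite-width analogue of the non-Euclidean regularization the abstract refers to; the infinite-width analysis itself is not reproduced here.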
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Machine Learning and ELM
