Kolmogorov Width Decay and Poor Approximators in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels

Weinan E; Stephan Wojtowytsch

arXiv:2005.10807·math.FA·October 5, 2020·6 cites

TL;DR

This paper shows that, in high dimension, reproducing kernel Hilbert spaces approximate two-layer neural networks poorly in $L^2$, and that multi-layer networks with small path norm approximate certain Lipschitz functions poorly in $L^2$.

Contribution

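It establishes a scale separation of Kolmogorov width type between subspaces of a Banach space, valid when a sequence of linear maps converges much faster on one of the subspaces, and applies it to show that kernel spaces and multi-layer networks with small path norm are poor approximators in high dimension.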

Findings

01

Reproducing kernel Hilbert spaces are poor $L^2$-approximators for the class of two-layer neural networks in high dimension.

02

Multi-layer networks with small path norm are poor approximators for certain Lipschitz functions in the $L^2$-topology.

03

Both results follow from a general scale separation of Kolmogorov width type between subspaces of a Banach space.

Abstract

We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces. The general technique is then applied to show that reproducing kernel Hilbert spaces are poor $L^{2}$-approximators for the class of two-layer neural networks in high dimension, and that multi-layer networks with small path norm are poor approximators for certain Lipschitz functions, also in the $L^{2}$-topology.
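
For orientation, the classical Kolmogorov $n$-width of a set $K$ in a Banach space $X$ can be written as below. This is the standard textbook definition, given here only as background: the symbols $K$, $X$, $V_n$ and $d_n$ are notation assumed for this note, and the "Kolmogorov width type" separation between subspaces established in the paper may be formulated differently.

$$
d_n(K; X) \;=\; \inf_{V_n \subseteq X,\ \dim V_n \le n} \;\; \sup_{f \in K} \;\; \inf_{g \in V_n} \, \|f - g\|_X
$$

Here the outer infimum runs over linear subspaces $V_n$ of $X$ of dimension at most $n$. Slow decay of $d_n(K; X)$ as $n \to \infty$ means that no $n$-dimensional linear subspace approximates every element of $K$ well.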

Taxonomy

Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques