Interpolation with deep neural networks with non-polynomial activations: necessary and sufficient numbers of neurons

Liam Madden

arXiv:2405.13738·cs.LG·September 18, 2024

TL;DR

This paper shows that, for any activation function that is real analytic at a point and not a polynomial there, $Θ(\sqrt{n d^{'}})$ neurons suffice for a feedforward neural network to interpolate $n$ generic input-output pairs with output dimension $d^{'}$. This matches the known minimal count and extends earlier sufficiency results, which covered only sigmoid, Heaviside, and ReLU activations.

Contribution

It proves that $Θ(\sqrt{n d^{'}})$ neurons suffice for interpolation with any activation function that is real analytic at a point and not a polynomial there, broadening the class of activation functions for which the optimal neuron count is known.

Findings

01

$Θ(\sqrt{n d^{'}})$ neurons are sufficient for interpolation.

02

The result applies to every activation function that is real analytic at a point and not a polynomial there.

03

Piecewise polynomial activations are the only practical functions excluded.

Abstract

The minimal number of neurons required for a feedforward neural network to interpolate $n$ generic input-output pairs from $\mathbb{R}^{d} \times \mathbb{R}^{d^{'}}$ is $Θ(\sqrt{n d^{'}})$. While previous results have shown that $Θ(\sqrt{n d^{'}})$ neurons are sufficient, they have been limited to sigmoid, Heaviside, and rectified linear unit (ReLU) as the activation function. Using a different approach, we prove that $Θ(\sqrt{n d^{'}})$ neurons are sufficient as long as the activation function is real analytic at a point and not a polynomial there. Thus, the only practical activation functions that our result does not apply to are piecewise polynomials. Importantly, this means that activation functions can be freely chosen in a problem-dependent manner without loss of interpolation power.
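One way to see why a square root appears in the neuron count is a parameter-counting heuristic. Interpolating $n$ generic pairs imposes $n d^{'}$ scalar constraints, and a deep network with hidden layers of width $w$ has on the order of $w^2$ parameters. The Python sketch below illustrates that heuristic only; it is not the paper's proof. The helper names (`count_params`, `min_equal_width`) and the example sizes are hypothetical, and the sketch assumes a fully connected network with biases and equal-width hidden layers.

```python
import math


def count_params(widths):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(widths, widths[1:]))


def min_equal_width(n, d, d_out, hidden_layers):
    """Smallest common hidden width whose parameter count reaches n * d_out.

    This is only the counting requirement (at least as many parameters as
    scalar constraints); it does not show that interpolation is achievable.
    """
    w = 1
    while count_params([d] + [w] * hidden_layers + [d_out]) < n * d_out:
        w += 1
    return w


# Hypothetical problem sizes chosen for illustration.
n, d, d_out = 10_000, 20, 5

shallow_neurons = min_equal_width(n, d, d_out, hidden_layers=1)
deep_width = min_equal_width(n, d, d_out, hidden_layers=2)
deep_neurons = 2 * deep_width

print("scalar constraints n * d':", n * d_out)
print("sqrt(n * d'):", round(math.sqrt(n * d_out), 1))
print("one hidden layer, neurons needed by counting:", shallow_neurons)
print("two hidden layers, neurons needed by counting:", deep_neurons)
```

With these sizes, counting alone asks for 1923 neurons in one hidden layer but only 422 across two hidden layers, which is within a small constant factor of $\sqrt{n d^{'}} \approx 223.6$. The contribution described in the abstract is the matching sufficiency statement for a broad class of activations.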

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM