
TL;DR
This paper provides a comprehensive theoretical framework for understanding the universal approximation property of neural networks, introducing new conditions on activation functions and architecture modifications to enhance their approximation capabilities.
Contribution
It offers a characterization, a representation, a construction method, and an existence result, each of which applies to any universal approximator on most function spaces of practical interest.
Findings
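A rescaled and shifted Leaky ReLU preserves universal approximation when multiple constraints are imposed on the network's final layers and the remaining layers are only sparsely connected
ReLU does not preserve universal approximation under those same constraints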
A simple architecture modification can approximate any continuous function with non-pathological growth
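To make the Leaky ReLU versus ReLU contrast concrete, the sketch below compares the two activations on one elementary property: Leaky ReLU is strictly increasing and therefore invertible, while ReLU maps every negative input to zero. Whether this is the property the paper's characterization relies on is an assumption here, not something this summary states. The slope `alpha` and the helper names are hypothetical, chosen only for illustration.

```python
def relu(x: float) -> float:
    """Standard ReLU: negative inputs are all mapped to 0."""
    return max(0.0, x)


def leaky_relu(x: float, alpha: float = 0.1) -> float:
    """Leaky ReLU with a hypothetical negative-side slope alpha > 0."""
    return x if x >= 0 else alpha * x


def leaky_relu_inverse(y: float, alpha: float = 0.1) -> float:
    """Inverse of leaky_relu; it exists because the function is strictly increasing."""
    return y if y >= 0 else y / alpha


# Sample inputs on both sides of zero.
xs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0]

relu_outputs = [relu(x) for x in xs]
leaky_outputs = [leaky_relu(x) for x in xs]

# An activation is injective on these samples if no two inputs share an output.
relu_is_injective = len(set(relu_outputs)) == len(xs)
leaky_is_injective = len(set(leaky_outputs)) == len(xs)

print("ReLU injective on samples:      ", relu_is_injective)
print("Leaky ReLU injective on samples:", leaky_is_injective)
```

Rescaling and shifting an activation (composing it with an affine map of nonzero slope) does not change whether it is injective, so the same check applies to a rescaled and shifted Leaky ReLU.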
Abstract
The universal approximation property of various machine learning models is currently only understood on a case-by-case basis, limiting the rapid development of new theoretically justified neural network architectures and blurring our understanding of our current models' potential. This paper works towards overcoming these challenges by presenting a characterization, a representation, a construction method, and an existence result, each of which applies to any universal approximator on most function spaces of practical interest. Our characterization result is used to describe which activation functions allow the feed-forward architecture to maintain its universal approximation capabilities when multiple constraints are imposed on its final layers and its remaining layers are only sparsely connected. These include a rescaled and shifted Leaky ReLU activation function but not the ReLU…
Taxonomy
Methods: Sigmoid Activation
