Minimum Width of Deep Narrow Networks for Universal Approximation
Xiao-Song Yang, Qi Zhou, Xuan Zhou

TL;DR
This paper establishes lower and upper bounds on the minimum width that fully connected neural networks need for universal approximation, covering several activation functions and giving a new geometric proof of a lower bound.
Contribution
New upper and lower bounds on the minimum width for universal approximation with ELU, SELU, LeakyReLU, CELU, and Softplus activations, together with a new geometric proof of the lower bound for injective activations.
Findings
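For ELU and SELU, the minimum width is at most max(2d_x+1, d_y), where d_x and d_y are the input and output dimensions.
For LeakyReLU, ELU, CELU, SELU, and Softplus, the minimum width is at least d_x+1 and at most d_x+d_y.
The paper gives a new geometric proof of the lower bound for the case where the activation is injective.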
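The expressions in the findings above are easy to misread when d_x and d_y vary, so the following minimal sketch simply evaluates them for given dimensions. The function name `width_bounds` and the dictionary keys are hypothetical and chosen only for illustration; the code restates the quoted formulas and is not taken from the paper.

```python
def width_bounds(d_x: int, d_y: int) -> dict:
    """Evaluate the width expressions quoted in the findings.

    d_x is the input dimension, d_y the output dimension.
    The key names are illustrative, not notation from the paper.
    """
    if d_x < 1 or d_y < 1:
        raise ValueError("dimensions must be positive integers")
    return {
        # Quoted for ELU and SELU.
        "elu_selu_upper": max(2 * d_x + 1, d_y),
        # Quoted for LeakyReLU, ELU, CELU, SELU, Softplus.
        "lower": d_x + 1,
        "upper": d_x + d_y,
    }


if __name__ == "__main__":
    for dims in [(3, 2), (2, 4), (1, 10)]:
        print(dims, width_bounds(*dims))
```

Neither upper expression dominates the other: for d_x = 3, d_y = 2 the value d_x+d_y = 5 is smaller than max(2d_x+1, d_y) = 7, while for d_x = 1, d_y = 10 the order is reversed (11 versus 10).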
Abstract
Determining the minimum width of fully connected neural networks has become a fundamental problem in recent theoretical studies of deep neural networks. In this paper, we study the lower and upper bounds of the minimum width required for fully connected neural networks to have universal approximation capability, which is important in network design and training. We show that the upper bound max(2d_x+1, d_y) on the minimum width also holds true for networks with ELU and SELU activation functions, and that the upper bound of this inequality is attained when [condition missing in source], where d_x, d_y denote the input and output dimensions, respectively. Besides, we show that the minimum width lies between d_x+1 and d_x+d_y for networks with LeakyReLU, ELU, CELU, SELU, and Softplus activation functions, by proving that the ReLU activation function can be approximated by these activation functions. In addition, in the case that the activation…
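The abstract states that ReLU can be approximated by the listed activations but this summary does not show the paper's construction. As a hedged illustration of the idea for one of them, the sketch below uses the standard scaled Softplus, log(1 + exp(beta*x)) / beta, whose gap to ReLU is largest at x = 0 and equals log(2) / beta. The scaling parameter `beta`, the helper names, and the evaluation grid are assumptions made for this example only.

```python
import math


def relu(x: float) -> float:
    return max(x, 0.0)


def softplus(x: float, beta: float = 1.0) -> float:
    """Scaled Softplus log(1 + exp(beta*x)) / beta, in a numerically stable form."""
    # Identity: log(1 + e^{bx}) / b = max(x, 0) + log1p(e^{-b|x|}) / b
    return max(x, 0.0) + math.log1p(math.exp(-beta * abs(x))) / beta


def max_gap(beta: float, lo: float = -5.0, hi: float = 5.0, n: int = 2001) -> float:
    """Largest |softplus - relu| over an evenly spaced grid that contains 0."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return max(abs(softplus(x, beta) - relu(x)) for x in xs)


if __name__ == "__main__":
    for beta in (1.0, 10.0, 100.0):
        # The measured gap should match log(2) / beta.
        print(beta, max_gap(beta), math.log(2) / beta)
```

Because the gap shrinks like 1/beta on the whole real line, the approximation is uniform, and the rescaling by beta is an affine change of variables of the kind a network's weights could absorb. Whether the paper argues this way for Softplus or for the other activations is not stated in this summary.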
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Machine Learning and ELM
