Minimum width for universal approximation using ReLU networks on compact domain

Namjun Kim; Chanho Min; Sejun Park

arXiv:2309.10402 · cs.LG · March 6, 2024 · 1 citation

TL;DR

This paper precisely characterizes the minimum width of ReLU-like neural networks needed for universal approximation on compact domains, revealing differences from known results on unbounded domains and establishing bounds for general activations.

Contribution

It provides an exact value for the minimum width for $L^p$ approximation with ReLU-like networks on compact domains and establishes lower bounds for uniform approximation with general activations.

Findings

01

Minimum width for $L^p$ approximation is $\max\{d_x, d_y, 2\}$ for ReLU-like activations.

02

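For ReLU networks, the minimum width for $L^p$ approximation on a compact domain, $\max\{d_x, d_y, 2\}$, is never larger than the $\max\{d_x + 1, d_y\}$ required on $\mathbb{R}^{d_x}$, and it is strictly smaller whenever $d_x \ge 2$ and $d_y \le d_x$.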

03

Lower bounds for uniform approximation show a dichotomy between $L^p$ and uniform approximation for certain dimensions.
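
The two width formulas quoted in this summary can be compared directly. The following is a minimal sketch that only evaluates those formulas; the function names `min_width_compact` and `min_width_unbounded` are hypothetical helpers introduced here for illustration and do not come from the paper.

```python
def min_width_compact(d_x: int, d_y: int) -> int:
    """Width max{d_x, d_y, 2}: L^p approximation on [0, 1]^{d_x}, ReLU-like activations."""
    return max(d_x, d_y, 2)


def min_width_unbounded(d_x: int, d_y: int) -> int:
    """Width max{d_x + 1, d_y}: the known ReLU result when the domain is R^{d_x}."""
    return max(d_x + 1, d_y)


if __name__ == "__main__":
    # Print both widths for a few (d_x, d_y) pairs and flag where they differ.
    for d_x, d_y in [(1, 1), (2, 1), (3, 3), (3, 5), (10, 4)]:
        compact = min_width_compact(d_x, d_y)
        unbounded = min_width_unbounded(d_x, d_y)
        gap = "strictly smaller" if compact < unbounded else "equal"
        print(f"d_x={d_x}, d_y={d_y}: compact={compact}, unbounded={unbounded} ({gap})")
```

Under these formulas the compact-domain width saves exactly one neuron when $d_x \ge 2$ and $d_y \le d_x$, and the two widths coincide otherwise.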

Abstract

It has been shown that deep neural networks of a large enough width are universal approximators but they are not if the width is too small. There were several attempts to characterize the minimum width $w_{\min}$ enabling the universal approximation property; however, only a few of them found the exact values. In this work, we show that the minimum width for $L^p$ approximation of $L^p$ functions from $[0, 1]^{d_x}$ to $\mathbb{R}^{d_y}$ is exactly $\max\{d_x, d_y, 2\}$ if an activation function is ReLU-Like (e.g., ReLU, GELU, Softplus). Compared to the known result for ReLU networks, $w_{\min} = \max\{d_x + 1, d_y\}$ when the domain is $\mathbb{R}^{d_x}$, our result first shows that approximation on a compact domain requires smaller width than on $\mathbb{R}^{d_x}$. We next prove a lower bound on $w_{\min}$ for uniform approximation using general activation functions including…
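
For readers unfamiliar with the three activations the abstract names as ReLU-Like examples, here is a minimal sketch using their standard textbook definitions. The paper's formal definition of "ReLU-Like" is not reproduced in this summary, so the code makes no claim about it; it only writes out ReLU, Softplus, and (exact, erf-based) GELU as scalar functions.

```python
import math


def relu(x: float) -> float:
    """ReLU: max(x, 0)."""
    return max(x, 0.0)


def softplus(x: float) -> float:
    """Softplus: log(1 + e^x), written in a form that avoids overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def gelu(x: float) -> float:
    """GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    # Far from the origin, Softplus and GELU are numerically close to ReLU.
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(f"x={x:6.1f}  relu={relu(x):.6f}  softplus={softplus(x):.6f}  gelu={gelu(x):.6f}")
```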

Videos

Minimum width for universal approximation using ReLU networks on compact domain · SlidesLive

Taxonomy

Topics: Machine Learning and ELM · Stochastic Gradient Optimization Techniques · Neural Networks and Applications