Optimal approximation of continuous functions by very deep ReLU networks
Dmitry Yarotsky

TL;DR
This paper characterizes how well deep ReLU neural networks can approximate continuous functions, revealing a phase diagram with two approximation regimes that differ in the required network depth and in whether the weight assignment is continuous.
Contribution
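The paper establishes the complete phase diagram of feasible approximation rates for deep ReLU networks, measured against the function's modulus of continuity and the total number of network weights, and identifies the trade-offs among approximation rate, depth, and continuity of the weight assignment.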
Findings
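Constant-depth networks with continuous weight assignments achieve the slower approximation rates.
Faster approximation rates require depth that grows as a power law and weight assignments that are discontinuous.
The fastest possible approximation rate is achieved by constant-width, fully-connected networks whose depth is proportional to the total number of weights.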
Abstract
We consider approximations of general continuous functions on finite-dimensional cubes by general deep ReLU neural networks and study the approximation rates with respect to the modulus of continuity of the function and the total number of weights in the network. We establish the complete phase diagram of feasible approximation rates and show that it includes two distinct phases. One phase corresponds to slower approximations that can be achieved with constant-depth networks and continuous weight assignments. The other phase provides faster approximations at the cost of depths necessarily growing as a power law and with necessarily discontinuous weight assignments. In particular, we prove that constant-width fully-connected networks of depth proportional to the total number of weights provide the fastest possible approximation rate $\|f-\widetilde f\|_\infty =…
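As a rough illustration of what it means to bound the error by the modulus of continuity, the sketch below builds a one-hidden-layer ReLU network on [0, 1] that linearly interpolates a target function at n + 1 uniform knots. This is a standard textbook construction for the constant-depth setting, not the paper's own construction, and all function names are hypothetical. Because the interpolant on each interval lies between the two endpoint values, its sup-norm error is at most \omega_f(1/n), while the number of weights grows linearly in n. The target f(x) = \sqrt{x} is an assumption chosen because its modulus of continuity is known in closed form, \omega_f(\delta) = \sqrt{\delta}.

```python
import math


def relu(x):
    return x if x > 0.0 else 0.0


def build_shallow_relu_interpolant(f, n):
    """Return (bias, knots, coeffs) of a one-hidden-layer ReLU network that
    linearly interpolates f at the n + 1 uniform knots k / n on [0, 1]."""
    knots = [k / n for k in range(n + 1)]
    values = [f(t) for t in knots]
    # Slope of the interpolant on each interval [k/n, (k+1)/n].
    slopes = [(values[k + 1] - values[k]) * n for k in range(n)]
    # The output weight of hidden unit k is the change of slope at knot k.
    coeffs = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, n)]
    return values[0], knots[:n], coeffs


def evaluate(net, x):
    """Evaluate the network: bias + sum_k coeffs[k] * relu(x - knots[k])."""
    bias, knots, coeffs = net
    return bias + sum(c * relu(x - t) for c, t in zip(coeffs, knots))


def count_weights(n):
    """n hidden units, each with an input weight, a bias, and an output
    weight, plus one output bias."""
    return 3 * n + 1


def sup_error(f, net, grid=2001):
    """Estimate the sup-norm error on a uniform grid over [0, 1]."""
    points = (i / (grid - 1) for i in range(grid))
    return max(abs(f(x) - evaluate(net, x)) for x in points)


# Assumed target: f(x) = sqrt(x), for which omega_f(delta) = sqrt(delta).
for n in (4, 16, 64, 256):
    net = build_shallow_relu_interpolant(math.sqrt, n)
    err = sup_error(math.sqrt, net)
    bound = math.sqrt(1.0 / n)  # omega_f(1 / n)
    print(f"n={n:4d}  weights={count_weights(n):4d}  "
          f"error={err:.4f}  omega_f(1/n)={bound:.4f}")
```

In this one-dimensional sketch the error bound improves only as fast as \omega_f evaluated at a quantity inversely proportional to the number of weights; the abstract's second phase concerns strictly faster rates that, according to the paper, need growing depth and discontinuous weight assignments.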
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and Algorithms
