Approximation speed of quantized vs. unquantized ReLU neural networks and beyond
Antoine Gonon (DANTE, ARIC), Nicolas Brisebarre (ARIC), Rémi Gribonval (DANTE), Elisa Riccietti (DANTE)

TL;DR
This paper investigates how quantization affects the approximation capabilities of ReLU neural networks and introduces the concept of infinite-encodability to compare their approximation efficiency with classical families.
Contribution
It provides bounds on the quantization bits needed for ReLU networks to retain approximation speed and introduces the novel property of infinite-encodability for approximation families.
Findings
Quantization can preserve approximation speed with a bounded number of bits.
A new lower bound is established on the Lipschitz constant of the map from network parameters to their realization.
Many classical approximation families are shown to be infinite-encodable.
Abstract
We deal with two complementary questions about approximation properties of ReLU networks. First, we study how the uniform quantization of ReLU networks with real-valued weights impacts their approximation properties. We establish an upper-bound on the minimal number of bits per coordinate needed for uniformly quantized ReLU networks to keep the same polynomial asymptotic approximation speeds as unquantized ones. We also characterize the error of nearest-neighbour uniform quantization of ReLU networks. This is achieved using a new lower-bound on the Lipschitz constant of the map that associates the parameters of ReLU networks to their realization, and an upper-bound generalizing classical results. Second, we investigate when ReLU networks can be expected, or not, to have better approximation properties than other classical approximation families. Indeed, several approximation families…
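To make the first question in the abstract concrete, the following is a minimal sketch of nearest-neighbour uniform quantization applied to the parameters of a small ReLU network. It is not the paper's construction: the grid step `2**-bits`, the network shape, the random parameters, and the helper names `quantize_uniform`, `relu_net`, and `sup_error` are all assumptions chosen for illustration. It only shows that rounding each coordinate to a uniform grid moves it by at most half a step, and measures the resulting output error on sample inputs.

```python
import numpy as np

def quantize_uniform(theta, bits):
    """Round each coordinate to the nearest multiple of step = 2**-bits.

    Assumption: the grid step is 2**-bits; the abstract does not fix a scheme.
    """
    step = 2.0 ** (-bits)
    return np.round(theta / step) * step

def relu_net(x, weights, biases):
    """Realization of a fully connected ReLU network (no ReLU on the last layer)."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h

# Hypothetical toy network: 2 inputs -> 8 hidden units -> 1 output.
rng = np.random.default_rng(0)
weights = [rng.uniform(-1, 1, (8, 2)), rng.uniform(-1, 1, (1, 8))]
biases = [rng.uniform(-1, 1, 8), rng.uniform(-1, 1, 1)]
xs = rng.uniform(-1, 1, (100, 2))  # sample inputs in [-1, 1]^2

def sup_error(bits):
    """Largest output gap between the original and quantized networks on xs."""
    qW = [quantize_uniform(W, bits) for W in weights]
    qb = [quantize_uniform(b, bits) for b in biases]
    return max(
        abs(relu_net(x, weights, biases) - relu_net(x, qW, qb))[0] for x in xs
    )

for bits in (2, 4, 8, 16):
    print(bits, sup_error(bits))
```

Running the loop prints the empirical error for several bit widths. The error is measured only on the sampled inputs, so it is a lower estimate of the true uniform error, not a bound.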
Taxonomy
Topics: Neural Networks and Applications · Advanced Data Compression Techniques · Model Reduction and Neural Networks
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
