Near-optimal estimates for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks
Sjoerd Dirksen, Patrick Finke, Paul Geuchen, Dominik Stöger, Felix Voigtlaender

TL;DR
This paper provides near-optimal bounds for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks, revealing different behaviors depending on the value of $p$ and offering insights into their stability.
Contribution
It derives high-probability upper and lower bounds for the $\ell^p$-Lipschitz constants of wide random ReLU networks, with matching bounds in the special case of shallow networks, and shows that the behavior of these constants depends on the regime of $p$.
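For orientation, the quantity being bounded can be written out as follows. This is the standard definition of a Lipschitz constant with respect to the $\ell^p$-norm, stated here as an assumption since the summary does not restate it; the symbols $\Phi$, $a$ and $p'$ are notation introduced for this note.

$$
\mathrm{Lip}_p(\Phi) \;=\; \sup_{x \neq y} \frac{|\Phi(x) - \Phi(y)|}{\|x - y\|_p},
\qquad
\mathrm{Lip}_p\bigl(x \mapsto \langle a, x \rangle\bigr) \;=\; \|a\|_{p'},
\quad \frac{1}{p} + \frac{1}{p'} = 1 .
$$

The second identity is Hölder duality for a linear map, which is why the dual exponent $p'$ is the natural one to compare against.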
Findings
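The upper and lower bounds for wide networks differ at most by a factor that is logarithmic in the network's width and linear in its depth.
For $p \geq 2$, the $\ell^p$-Lipschitz constant behaves like the $\ell^{p'}$-norm of a standard Gaussian vector, where $1/p + 1/p' = 1$.
For $p < 2$, the $\ell^p$-Lipschitz constant is instead closer to the Euclidean norm.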
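The findings above can be explored numerically with a small NumPy experiment. This is a rough sketch under stated assumptions, not the paper's method: the network is bias-free, hidden weights are Gaussian with variance 2/(layer width), the output layer is standard Gaussian, and all function names are hypothetical. It uses the fact that, for a piecewise linear function, the largest dual norm $\|\nabla\Phi(x)\|_{p'}$ (with $1/p + 1/p' = 1$) over sampled points is a lower bound on the $\ell^p$-Lipschitz constant.

```python
import numpy as np

def dual_exponent(p):
    """Return p' with 1/p + 1/p' = 1 (p = 1 pairs with p' = inf)."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)

def sample_network(d, width, depth, rng):
    """Weights of a bias-free ReLU network R^d -> R with `depth` hidden layers.

    Assumption: hidden entries are N(0, 2/n_out), so each ReLU layer preserves
    the squared Euclidean norm in expectation; the output layer is N(0, 1).
    """
    dims = [d] + [width] * depth
    hidden = [rng.normal(0.0, np.sqrt(2.0 / n_out), size=(n_out, n_in))
              for n_in, n_out in zip(dims[:-1], dims[1:])]
    out = rng.normal(0.0, 1.0, size=(1, dims[-1]))
    return hidden, out

def forward(hidden, out, x):
    """Evaluate the network at a single input vector x."""
    h = x
    for W in hidden:
        h = np.maximum(W @ h, 0.0)
    return float((out @ h)[0])

def gradient(hidden, out, x):
    """Gradient at x via the chain rule (valid away from ReLU kinks)."""
    jac = np.eye(x.size)
    h = x
    for W in hidden:
        pre = W @ h
        active = (pre > 0).astype(float)   # diagonal of the ReLU derivative
        jac = (active[:, None] * W) @ jac
        h = np.maximum(pre, 0.0)
    return (out @ jac).ravel()

def lipschitz_lower_bound(hidden, out, p, points):
    """Max of ||grad||_{p'} over the sample points: a lower bound only."""
    q = dual_exponent(p)
    return max(np.linalg.norm(gradient(hidden, out, x), ord=q) for x in points)

# Illustrative sizes, chosen arbitrarily for the demo.
rng = np.random.default_rng(0)
d, width, depth = 50, 200, 3
hidden, out = sample_network(d, width, depth, rng)
points = rng.normal(size=(200, d))
g = rng.normal(size=d)                      # standard Gaussian reference vector

for p in (1.0, 1.5, 2.0, 4.0, np.inf):
    est = lipschitz_lower_bound(hidden, out, p, points)
    ref_dual = np.linalg.norm(g, ord=dual_exponent(p))
    ref_l2 = np.linalg.norm(g, ord=2)
    print(f"p={p}: sampled lower bound={est:.2f}, "
          f"||g||_p'={ref_dual:.2f}, ||g||_2={ref_l2:.2f}")
```

Because random sampling can miss the region where the gradient is largest, the printed values should be read as lower bounds to compare qualitatively against the two Gaussian reference norms, not as estimates of the true constants.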
Abstract
This paper studies the $\ell^p$-Lipschitz constants of ReLU neural networks with random parameters for $p \in [1,\infty]$. The distribution of the weights follows a variant of the He initialization and the biases are drawn from symmetric distributions. We derive high probability upper and lower bounds for wide networks that differ at most by a factor that is logarithmic in the network's width and linear in its depth. In the special case of shallow networks, we obtain matching bounds. Remarkably, the behavior of the $\ell^p$-Lipschitz constant varies significantly between the regimes $p \in [1,2)$ and $p \in [2,\infty]$. For $p \in [2,\infty]$, the $\ell^p$-Lipschitz constant behaves similarly to $\|g\|_{p'}$, where $g \in \mathbb{R}^d$ is a $d$-dimensional standard Gaussian vector and $1/p + 1/p' = 1$. In contrast, for $p \in [1,2)$, the…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Image and Signal Denoising Methods
