A PAC-Bayes oracle inequality for sparse neural networks
Maximilian F. Steffen, Mathias Trabs

TL;DR
This paper establishes a PAC-Bayes oracle inequality for sparse neural networks, demonstrating that the Gibbs posterior adapts to unknown function regularity and achieves near-optimal convergence rates in nonparametric regression.
Contribution
The paper proves a new oracle inequality for sparse deep neural networks using PAC-Bayes theory, showing that the Gibbs posterior adaptively attains the minimax-optimal convergence rate up to a logarithmic factor.
Findings
The Gibbs posterior can be accessed via Metropolis-adjusted Langevin algorithms.
The method adapts to the unknown regularity and hierarchical structure of the regression function.
The estimator achieves the minimax-optimal rate of convergence up to a logarithmic factor.
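For orientation, the objects behind these findings can be written in standard PAC-Bayes notation. This is a sketch rather than the paper's exact statement. The symbols are assumptions introduced here: $\lambda$ is an inverse temperature, $R_n$ the empirical squared loss, $\Pi$ the prior on the network weights $\theta$, and $f_\theta$ the network. The last line recalls the classical minimax benchmark for a $\beta$-Hölder regression function in dimension $d$. The rate in the paper is governed by the hierarchical structure of the regression function and need not reduce to this expression.

```latex
% Gibbs posterior: the prior reweighted by the exponentiated empirical risk
\Pi_\lambda(\mathrm{d}\theta \mid \mathcal{D}_n)
  \;\propto\; \exp\bigl(-\lambda R_n(\theta)\bigr)\,\Pi(\mathrm{d}\theta),
\qquad
R_n(\theta) = \frac{1}{n}\sum_{i=1}^{n}\bigl(Y_i - f_\theta(X_i)\bigr)^2 .

% Classical minimax benchmark for \beta-Hölder functions on [0,1]^d
\inf_{\hat f}\,\sup_{f}\;\mathbb{E}\,\bigl\|\hat f - f\bigr\|_{L^2}^2
  \;\asymp\; n^{-2\beta/(2\beta+d)} .
```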
Abstract
We study the Gibbs posterior distribution for sparse deep neural nets in a nonparametric regression setting. The posterior can be accessed via Metropolis-adjusted Langevin algorithms. Using a mixture over uniform priors on sparse sets of network weights, we prove an oracle inequality which shows that the method adapts to the unknown regularity and hierarchical structure of the regression function. The estimator achieves the minimax-optimal rate of convergence (up to a logarithmic factor).
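To make the sampling step concrete, here is a minimal NumPy sketch of a Metropolis-adjusted Langevin algorithm (MALA) targeting a Gibbs posterior proportional to exp(-lam * empirical risk) times a prior. It is an illustration under stated assumptions, not the authors' implementation. The network (one hidden tanh layer of width H), the inverse temperature lam, the step size, and all function names are hypothetical. The prior is simplified to a single uniform distribution on the box [-B, B]^P, whereas the paper uses a mixture over uniform priors on sparse sets of weights.

```python
import numpy as np

H = 4      # hidden width (illustrative assumption)
B = 10.0   # half-width of the uniform prior box (illustrative assumption)

def unpack(theta):
    """Split the flat parameter vector into (w1, b1, w2, b2)."""
    return theta[:H], theta[H:2 * H], theta[2 * H:3 * H], theta[3 * H]

def risk_and_grad(theta, x, y):
    """Empirical squared loss of a one-hidden-layer tanh net and its gradient."""
    w1, b1, w2, b2 = unpack(theta)
    h = np.tanh(np.outer(x, w1) + b1)        # hidden activations, shape (n, H)
    r = h @ w2 + b2 - y                      # residuals, shape (n,)
    g = 2.0 * r / len(x)                     # d(risk) / d(prediction)
    dh = np.outer(g, w2) * (1.0 - h ** 2)    # backprop through tanh
    grad = np.concatenate([dh.T @ x, dh.sum(axis=0), h.T @ g, [g.sum()]])
    return np.mean(r ** 2), grad

def log_post_and_grad(theta, x, y, lam):
    """Unnormalised log Gibbs posterior; -inf outside the uniform prior's box."""
    if np.any(np.abs(theta) > B):
        return -np.inf, np.zeros_like(theta)
    risk, grad = risk_and_grad(theta, x, y)
    return -lam * risk, -lam * grad

def mala(theta, x, y, lam, step, n_iter, rng):
    """Run MALA and return the chain together with the acceptance rate."""
    lp, g = log_post_and_grad(theta, x, y, lam)
    samples, accepted = [], 0
    for _ in range(n_iter):
        # Langevin proposal: gradient drift plus Gaussian noise
        prop = theta + step * g + np.sqrt(2 * step) * rng.standard_normal(theta.size)
        lp_prop, g_prop = log_post_and_grad(prop, x, y, lam)
        if np.isfinite(lp_prop):  # proposals outside the box are always rejected
            log_q_fwd = -np.sum((prop - theta - step * g) ** 2) / (4 * step)
            log_q_rev = -np.sum((theta - prop - step * g_prop) ** 2) / (4 * step)
            # Metropolis-Hastings correction for the asymmetric proposal
            if np.log(rng.uniform()) < lp_prop - lp + log_q_rev - log_q_fwd:
                theta, lp, g = prop, lp_prop, g_prop
                accepted += 1
        samples.append(theta.copy())
    return np.array(samples), accepted / n_iter

# Toy usage on synthetic data (all values are illustrative)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 50)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(50)
theta0 = 0.5 * rng.standard_normal(3 * H + 1)
samples, acc_rate = mala(theta0, x, y, lam=50.0, step=1e-3, n_iter=2000, rng=rng)
```

Averaging the network predictions over the later part of the chain gives a posterior-mean style estimator. The step size and lam would need tuning in practice, and the sketch makes no claim about mixing time.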
