Posterior and variational inference for deep neural networks with heavy-tailed weights
Ismaël Castillo, Paul Egels

TL;DR
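This paper introduces a Bayesian deep learning prior with heavy-tailed weights and ReLU activations. Its posterior achieves near-optimal minimax contraction rates that adapt to both smoothness and intrinsic dimension, without sampling hyperparameters to learn the network architecture, in settings including nonparametric regression, geometric data and Besov spaces.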
Contribution
It proposes a novel heavy-tailed prior for neural networks that adapts to smoothness and intrinsic dimension, with theoretical guarantees and variational Bayes extensions.
Findings
The posterior distribution achieves near-optimal minimax contraction rates.
The method adapts simultaneously to smoothness and intrinsic dimension without sampling hyperparameters to learn the architecture.
Variational Bayes approximations retain near-optimal theoretical properties.
Abstract
We consider deep neural networks in a Bayesian framework with a prior distribution sampling the network weights at random. Following a recent idea of Agapiou and Castillo (2023), who show that heavy-tailed prior distributions achieve automatic adaptation to smoothness, we introduce a simple Bayesian deep learning prior based on heavy-tailed weights and ReLU activation. We show that the corresponding posterior distribution achieves near-optimal minimax contraction rates, simultaneously adaptive to both intrinsic dimension and smoothness of the underlying function, in a variety of contexts including nonparametric regression, geometric data and Besov spaces. While most works so far need a form of model selection built into the prior distribution, a key aspect of our approach is that it does not require sampling hyperparameters to learn the architecture of the network. We also provide…
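As a rough illustration of the kind of prior the abstract describes, the sketch below draws one ReLU network whose weights and biases are sampled independently from a heavy-tailed law. The Student-t distribution, its degrees of freedom, the scale, the layer widths, and the function names are all assumptions chosen for illustration; the abstract does not specify the heavy-tailed family or any scaling, so this should not be read as the authors' construction.

```python
import numpy as np

def sample_heavy_tailed_relu_net(widths, df=2.0, scale=1.0, rng=None):
    """Draw one network from an illustrative heavy-tailed prior.

    Assumption: every weight and bias is i.i.d. scaled Student-t with `df`
    degrees of freedom. The paper's actual prior may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    params = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        W = scale * rng.standard_t(df, size=(n_out, n_in))  # heavy-tailed weights
        b = scale * rng.standard_t(df, size=n_out)          # heavy-tailed biases
        params.append((W, b))
    return params

def forward(params, x):
    """Evaluate the network: ReLU on hidden layers, linear output layer."""
    h = np.asarray(x, dtype=float)
    for i, (W, b) in enumerate(params):
        h = h @ W.T + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)  # ReLU activation
    return h

# One prior draw with a fixed architecture (hypothetical widths).
rng = np.random.default_rng(0)
widths = [3, 16, 16, 1]
params = sample_heavy_tailed_relu_net(widths, df=2.0, scale=0.1, rng=rng)
x = rng.uniform(size=(5, 3))  # five inputs in [0, 1]^3
y = forward(params, x)        # one prior function evaluated at x
```

Note that the architecture (`widths`) is held fixed while only the weights are random, which mirrors the abstract's point that no hyperparameters are sampled to learn the architecture. Posterior or variational inference over these weights is not shown here.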
Taxonomy
Topics: Seismic Imaging and Inversion Techniques · Neural Networks and Applications · Image and Signal Denoising Methods
