Feature learning in finite-width Bayesian deep linear networks with multiple outputs and convolutional layers
Federico Bassetti, Marco Gherardi, Alessandro Ingrosso, Mauro Pastore, Pietro Rotondo

TL;DR
This paper provides a rigorous Bayesian analysis of finite-width deep linear networks with multiple outputs and convolutional layers, offering exact formulas for priors and posteriors and insights into feature learning regimes.
Contribution
The paper derives an exact, non-asymptotic integral representation of the prior over outputs and an analytical formula for the posterior under squared error loss, covering finite-width deep linear networks with multiple outputs and convolutional layers.
Findings
Exact non-asymptotic prior distribution as a mixture of Gaussians
Analytical posterior distribution for squared error loss (Gaussian likelihood)
Quantitative description of the feature learning infinite-width regime, obtained with large deviation theory
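The first finding can be illustrated numerically with a toy model. The sketch below is not the paper's construction: it assumes a single hidden layer, a scalar output, one fixed input, and i.i.d. standard Gaussian weight priors with 1/sqrt(fan-in) scaling. The function names and parameters are hypothetical. Conditional on the first-layer weights the output is Gaussian, so the prior over outputs is a mixture of Gaussians, and its non-Gaussianity (measured here by excess kurtosis) should shrink as the width grows.

```python
import numpy as np

def sample_prior_outputs(width, n_samples, input_dim=3, seed=0):
    """Draw prior samples of f(x) = v . (W x) / sqrt(width * input_dim).

    Assumed toy setup (not taken from the paper): one hidden layer,
    scalar output, standard Gaussian priors on W and v.
    """
    rng = np.random.default_rng(seed)
    x = np.ones(input_dim)                       # fixed input, |x|^2 = input_dim
    W = rng.standard_normal((n_samples, width, input_dim))
    v = rng.standard_normal((n_samples, width))
    h = W @ x / np.sqrt(input_dim)               # hidden features, shape (n_samples, width)
    # Given h, the output is Gaussian with variance |h|^2 / width,
    # so marginally it is a scale mixture of Gaussians.
    return np.sum(v * h, axis=1) / np.sqrt(width)

def excess_kurtosis(samples):
    """Sample excess kurtosis; zero for an exactly Gaussian variable."""
    centered = samples - samples.mean()
    return np.mean(centered**4) / np.mean(centered**2) ** 2 - 3.0

if __name__ == "__main__":
    for width in (2, 8, 32):
        f = sample_prior_outputs(width, n_samples=50_000)
        # For this toy model the exact excess kurtosis is 6 / width.
        print(width, excess_kurtosis(f), 6.0 / width)
```

In this toy setting the estimated excess kurtosis should track 6/width up to sampling noise, which gives a quick sense of how far a finite-width prior sits from the Gaussian limit.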
Abstract
Deep linear networks have been extensively studied, as they provide simplified models of deep learning. However, little is known in the case of finite-width architectures with multiple outputs and convolutional layers. In this manuscript, we provide rigorous results for the statistics of functions implemented by the aforementioned class of networks, thus moving closer to a complete characterization of feature learning in the Bayesian setting. Our results include: (i) an exact and elementary non-asymptotic integral representation for the joint prior distribution over the outputs, given in terms of a mixture of Gaussians; (ii) an analytical formula for the posterior distribution in the case of squared error loss function (Gaussian likelihood); (iii) a quantitative description of the feature learning infinite-width regime, using large deviation theory. From a physical perspective, deep…
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications · Anomaly Detection Techniques and Applications
