TL;DR
This paper introduces a prior learning method for neural networks that improves generalization and uncertainty estimation by using scalable, structured posteriors as informative priors with generalization guarantees.
Contribution
It proposes a scalable prior learning method built on sums-of-Kronecker-product computations and tractable objectives that lead to improved generalization bounds, and it extends the method to continual learning.
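To give a sense of why Kronecker structure helps with scale, here is a minimal sketch, not the paper's implementation. It applies a sum of Kronecker products to a vector without building the large matrix, using the standard identity (A ⊗ B) vec(X) = vec(A X Bᵀ) for row-major vec. The function names (`sum_kron_matvec`, `kron`, `matmul`, `transpose`) and the toy factors are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only; all names and the toy data are assumptions.

def matmul(P, Q):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]

def kron(A, B):
    """Dense Kronecker product; used here only to check the structured path."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def sum_kron_matvec(factors, x):
    """Apply sum_k (A_k kron B_k) to x without forming any Kronecker product.

    Relies on (A kron B) vec(X) = vec(A X B^T), where vec flattens row by row.
    Assumes every pair (A_k, B_k) in `factors` has the same shapes.
    """
    A0, B0 = factors[0]
    m, q = len(A0[0]), len(B0[0])
    # Reshape x into an m-by-q matrix, row-major.
    X = [x[i * q:(i + 1) * q] for i in range(m)]
    out = [[0.0] * len(B0) for _ in range(len(A0))]
    for A, B in factors:
        Y = matmul(matmul(A, X), transpose(B))
        out = [[o + y for o, y in zip(ro, ry)] for ro, ry in zip(out, Y)]
    # Flatten the result back into a vector, row-major.
    return [v for row in out for v in row]

# Toy check against the dense computation.
factors = [
    ([[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]),
    ([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]),
]
x = [1.0, 2.0, 3.0, 4.0]

dense = [[0.0] * 4 for _ in range(4)]
for A, B in factors:
    K = kron(A, B)
    dense = [[d + k for d, k in zip(rd, rk)] for rd, rk in zip(dense, K)]

dense_result = [sum(d * v for d, v in zip(row, x)) for row in dense]
fast_result = sum_kron_matvec(factors, x)
```

For factors of size p×p and q×q, the dense matrix has (pq)² entries, while the structured path only ever touches the small factors and a p×q reshaping of the vector. How the paper itself organizes these computations is not specified in this summary.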
Findings
Effective in uncertainty estimation and generalization
Provides non-vacuous generalization bounds
Scales to large models, such as Bayesian counterparts of networks pre-trained on ImageNet
Abstract
In this work, we propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks. The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees. Our learned priors provide expressive probabilistic representations at large scale, like Bayesian counterparts of pre-trained models on ImageNet, and further produce non-vacuous generalization bounds. We also extend this idea to a continual learning framework, where the favorable properties of our priors are desirable. Major enablers are our technical contributions: (1) the sums-of-Kronecker-product computations, and (2) the derivations and optimizations of tractable objectives that lead to improved generalization bounds. Empirically, we exhaustively show the effectiveness of this method for uncertainty estimation and…