TL;DR
This paper introduces a new PAC-Bayesian method enabling sequential prior updates without losing confidence information, improving bounds and empirical performance in learning tasks.
Contribution
It proposes a simple, powerful PAC-Bayesian procedure for sequential prior updates with no information loss, based on a novel loss decomposition and an extension of split-kl inequalities to discrete random variables.
Findings
Significantly outperforms state-of-the-art methods in empirical tests.
Allows recursive confidence bounds that retain the confidence information carried by the prior.
Extends split-kl inequalities to discrete random variables.
Abstract
PAC-Bayesian analysis is a frequentist framework for incorporating prior knowledge into learning. It was inspired by Bayesian learning, which allows sequential data processing and naturally turns posteriors from one processing step into priors for the next. However, despite two and a half decades of research, the ability to update priors sequentially without losing confidence information along the way remained elusive for PAC-Bayes. While PAC-Bayes allows construction of data-informed priors, the final confidence intervals depend only on the number of points that were not used for the construction of the prior, whereas confidence information in the prior, which is related to the number of points used to construct the prior, is lost. This limits the possibility and benefit of sequential prior updates, because the final bounds depend only on the size of the final batch. We present a…
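The limitation described above can be illustrated with a minimal sketch. It is not the paper's procedure. It evaluates the standard PAC-Bayes-kl bound, kl(empirical loss || expected loss) <= (KL(posterior || prior) + ln(2 sqrt(n) / delta)) / n, where n counts only the points not used to build the prior. The function names, the empirical loss of 0.1, and the KL value of 5 are illustrative assumptions; in practice a better-informed prior would also change the KL term, which is held fixed here to isolate the effect of n.

```python
import math

def binary_kl(p, q):
    """kl(p || q) between Bernoulli distributions with means p and q."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)  # clamp to avoid log(0)
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total

def kl_inverse_upper(p_hat, budget, tol=1e-9):
    """Largest q >= p_hat with kl(p_hat || q) <= budget, via bisection."""
    lo, hi = p_hat, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_kl(p_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return hi

def pac_bayes_kl_bound(emp_loss, kl_posterior_prior, n_bound, delta=0.05):
    """Upper bound on expected loss; n_bound is the number of points
    NOT used to construct the prior."""
    budget = (kl_posterior_prior + math.log(2 * math.sqrt(n_bound) / delta)) / n_bound
    return kl_inverse_upper(emp_loss, budget)

# Hypothetical numbers: same empirical loss and KL term, shrinking final batch.
for n_bound in (5000, 1000, 100):
    bound = pac_bayes_kl_bound(emp_loss=0.1, kl_posterior_prior=5.0, n_bound=n_bound)
    print(f"points left for the bound: {n_bound:5d}  bound: {bound:.4f}")
```

Under these assumptions the bound loosens as more points are moved into prior construction, because only the final batch size enters the denominator. This is the loss of confidence information that the abstract refers to.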
