Improving Generalization of Complex Models under Unbounded Loss Using PAC-Bayes Bounds
Xitong Zhang, Avrajit Ghosh, Guangliang Liu, Rongrong Wang

TL;DR
This paper introduces a new PAC-Bayes training algorithm that effectively handles unbounded loss functions, jointly trains prior and posterior, and achieves test accuracies comparable to traditional empirical risk minimization methods.
Contribution
It presents a novel PAC-Bayes bound for unbounded loss and a training procedure that jointly optimizes prior and posterior without extensive prior tuning.
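To make "jointly optimizes prior and posterior" concrete, here is a minimal sketch of the KL term that PAC-Bayes bounds typically contain. It is not the paper's bound or algorithm. The setup is an assumption chosen for illustration: a diagonal Gaussian posterior N(mu, diag(sigma2)), an isotropic Gaussian prior N(w0, lam * I), and hypothetical function names. For a fixed posterior, the prior variance `lam` that minimizes this KL has a closed form, which suggests why treating the prior as trainable can reduce the need for a manual prior search.

```python
import math

def kl_diag_gaussian_vs_isotropic(mu, sigma2, w0, lam):
    """KL(Q || P) for Q = N(mu, diag(sigma2)) and P = N(w0, lam * I).

    Illustrative assumption: the prior is isotropic with one variance lam.
    """
    total = 0.0
    for m, s2, w in zip(mu, sigma2, w0):
        total += s2 / lam + (m - w) ** 2 / lam - 1.0 + math.log(lam / s2)
    return 0.5 * total

def best_isotropic_prior_variance(mu, sigma2, w0):
    """Prior variance minimizing the KL above for a fixed posterior.

    Setting d(KL)/d(lam) = 0 gives lam = mean(sigma2_i + (mu_i - w0_i)^2).
    """
    return sum(s2 + (m - w) ** 2 for m, s2, w in zip(mu, sigma2, w0)) / len(mu)

# Toy posterior around an initialization w0 (values are made up).
mu = [0.5, -1.0, 2.0]
sigma2 = [0.1, 0.2, 0.05]
w0 = [0.0, 0.0, 0.0]

lam_star = best_isotropic_prior_variance(mu, sigma2, w0)
kl_star = kl_diag_gaussian_vs_isotropic(mu, sigma2, w0, lam_star)
kl_fixed = kl_diag_gaussian_vs_isotropic(mu, sigma2, w0, 1.0)
print(lam_star, kl_star, kl_fixed)
```

In a full training loop, a term like this would be added to an empirical loss and minimized over the posterior and prior parameters together; the exact form of that objective is given by the paper's bound, which this page does not reproduce.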
Findings
Outperforms existing PAC-Bayes algorithms in various tasks.
Achieves test accuracy close to ERM with optimal regularization.
Demonstrates effectiveness across multiple neural network architectures.
Abstract
Previous research on PAC-Bayes learning theory has focused extensively on establishing tight upper bounds for test errors. A recently proposed training procedure, called PAC-Bayes training, updates the model toward minimizing these bounds. Although this approach is theoretically sound, in practice it has not achieved test errors as low as those obtained by empirical risk minimization (ERM) with carefully tuned regularization hyperparameters. Additionally, existing PAC-Bayes training algorithms often require bounded loss functions and may need a search over priors with additional datasets, which limits their broader applicability. In this paper, we introduce a new PAC-Bayes training algorithm with improved performance and reduced reliance on prior tuning. This is achieved by establishing a new PAC-Bayes bound for unbounded loss and a theoretically grounded approach that involves jointly…
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis
Methods: Adam
