Towards A Unified PAC-Bayesian Framework for Norm-based Generalization Bounds
Xinping Yi, Gaojie Jin, Xiaowei Huang, and Shi Jin

TL;DR
This paper introduces a unified PAC-Bayesian framework for deriving norm-based generalization bounds in deep learning, addressing previous limitations by incorporating structured weight perturbations and network sensitivities.
Contribution
It reformulates generalization bounds as a stochastic optimization over anisotropic Gaussian posteriors, enabling architecture-aware and tighter bounds.
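To make the role of the anisotropic posterior concrete, the sketch below computes the standard closed-form KL divergence between a diagonal Gaussian posterior centered at the learned weights and a zero-mean isotropic Gaussian prior. This KL term is the complexity penalty that appears in PAC-Bayesian bounds in general. It is only an illustration under assumptions: the diagonal covariance, the zero-mean prior, and the names `kl_diag_gaussian`, `post_std`, and `prior_std` are hypothetical choices, not the paper's formulation.

```python
import math


def kl_diag_gaussian(mean, post_std, prior_std):
    """KL( N(mean, diag(post_std^2)) || N(0, prior_std^2 * I) ).

    Hypothetical helper for illustration: a diagonal (anisotropic)
    posterior against a zero-mean isotropic prior.
    """
    if len(mean) != len(post_std):
        raise ValueError("mean and post_std must have the same length")
    total = 0.0
    for m, s in zip(mean, post_std):
        ratio = (s / prior_std) ** 2  # per-coordinate variance ratio
        total += ratio + (m / prior_std) ** 2 - 1.0 - math.log(ratio)
    return 0.5 * total


# Toy weights (made-up values, not from the paper).
weights = [1.0, -2.0, 0.5]

# Isotropic posterior matching the prior scale: KL reduces to ||w||^2 / (2 sigma^2).
iso = kl_diag_gaussian(weights, [0.1, 0.1, 0.1], 0.1)

# Anisotropic posterior: a different perturbation scale per coordinate.
aniso = kl_diag_gaussian(weights, [0.1, 0.05, 0.2], 0.1)
```

For a fixed prior, moving the posterior scales away from the prior scale can only increase this KL term. Any gain from anisotropy therefore has to come from the other side of the trade-off, for example by allowing larger perturbations on parameters the network is less sensitive to, which is what makes the choice of posterior an optimization problem.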
Findings
Recover several existing PAC-Bayesian bounds as special cases
Derive bounds comparable to or tighter than state-of-the-art
Incorporate heterogeneous parameter sensitivities and architecture structures
Abstract
Understanding the generalization behavior of deep neural networks remains a fundamental challenge in modern statistical learning theory. Among existing approaches, PAC-Bayesian norm-based bounds have demonstrated particular promise due to their data-dependent nature and their ability to capture algorithmic and geometric properties of learned models. However, most existing results rely on isotropic Gaussian posteriors, heavy use of spectral-norm concentration for weight perturbations, and largely architecture-agnostic analyses, which together limit both the tightness and practical relevance of the resulting bounds. To address these limitations, in this work, we propose a unified framework for PAC-Bayesian norm-based generalization by reformulating the derivation of generalization bounds as a stochastic optimization problem over anisotropic Gaussian posteriors. The key to our approach is…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference
