Block-Sample MAC-Bayes Generalization Bounds

Matthias Frey; Jingge Zhu; Michael C. Gastpar

arXiv:2602.12605·cs.LG·February 16, 2026

TL;DR

This paper introduces a new family of MAC-Bayes generalization bounds that bound the expected generalization error and whose divergence terms depend only on blocks of the training data, potentially offering tighter bounds than traditional PAC-Bayes bounds.

Contribution

The paper proposes a family of MAC-Bayes bounds that generalizes an expectation version of known PAC-Bayes bounds. Its divergence terms depend only on blocks of the training data, which can make the bounds tighter than traditional PAC-Bayes and MAC-Bayes bounds.

Findings

01 MAC-Bayes bounds can be tighter than PAC-Bayes bounds.

02 Original PAC-Bayes bounds may be vacuous while MAC-Bayes bounds remain finite.

03 High-probability MAC-Bayes bounds cannot generally be established.
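
The second finding can be illustrated with a toy calculation. The sketch below is not the paper's numerical example. It assumes the standard Gaussian toy setting from the information-theoretic bounds literature: the hypothesis W is the empirical mean of n i.i.d. N(0, 1) samples, and S_j is a block of m of those samples. In that setting the information between W and a block has the closed form 0.5·log(n / (n − m)), which is finite for m < n and infinite for m = n. The function names, the parameter `sigma`, and the bound shape sqrt(2σ²·I/m) (a generic form for a hypothetical σ-sub-Gaussian loss) are assumptions made for illustration only.

```python
import math


def block_information(n: int, m: int) -> float:
    """I(W; S_j) in nats for the toy model W = mean of n i.i.d. N(0, 1) samples,
    where S_j is a block of m of those samples (illustrative assumption)."""
    if not 1 <= m <= n:
        raise ValueError("block size m must satisfy 1 <= m <= n")
    if m == n:
        # W is a deterministic function of the full sample, so the
        # full-sample information term diverges.
        return math.inf
    # Var(W) = 1/n and Var(W | S_j) = (n - m) / n^2 for jointly Gaussian variables.
    return 0.5 * math.log(n / (n - m))


def schematic_bound(n: int, m: int, sigma: float = 1.0) -> float:
    """Generic sub-Gaussian bound shape sqrt(2 * sigma^2 * I(W; S_j) / m).
    Hypothetical form for illustration, not a theorem from the paper."""
    return math.sqrt(2.0 * sigma**2 * block_information(n, m) / m)


if __name__ == "__main__":
    n = 100
    for m in (1, 10, 50, 100):
        print(f"m={m:3d}  I={block_information(n, m):.4f}  bound={schematic_bound(n, m):.4f}")
```

With the full sample as a single block (m = n) the information term is infinite and the bound is vacuous, while any smaller block size gives a finite value.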

Abstract

We present a family of novel block-sample MAC-Bayes bounds (mean approximately correct). While PAC-Bayes bounds (probably approximately correct) typically give bounds for the generalization error that hold with high probability, MAC-Bayes bounds have a similar form but bound the expected generalization error instead. The family of bounds we propose can be understood as a generalization of an expectation version of known PAC-Bayes bounds. Compared to standard PAC-Bayes bounds, the new bounds contain divergence terms that only depend on subsets (or *blocks*) of the training data. The proposed MAC-Bayes bounds hold the promise of significantly improving upon the tightness of traditional PAC-Bayes and MAC-Bayes bounds. This is illustrated with a simple numerical example in which the original PAC-Bayes bound is vacuous regardless of the choice of prior, while the proposed family of…
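
For orientation, the kind of statement the abstract describes can be written schematically as follows. This is a textbook-style sketch under stated assumptions, not a theorem quoted from the paper: the loss is assumed to be $\sigma$-sub-Gaussian under the data distribution for every fixed hypothesis, $Q_W$ is a data-independent prior, $P_{W|S}$ is the posterior given the full sample $S$ of size $n$, and the sample is assumed to split into $J$ disjoint blocks $S_1, \dots, S_J$ of size $m$ (so $n = Jm$), with $P_{W|S_j}$ the distribution of $W$ given block $S_j$ only.

```latex
% Expectation-form ("MAC-Bayes"-style) bound with a full-sample divergence term.
% Assumption: sigma-sub-Gaussian loss; Q_W does not depend on the data.
\mathbb{E}\big[\mathrm{gen}(W,S)\big]
  \le \sqrt{\frac{2\sigma^2}{n}\,
      \mathbb{E}_{S}\big[D_{\mathrm{KL}}\big(P_{W|S}\,\big\|\,Q_W\big)\big]}

% Schematic block-sample analogue: each divergence term sees one block only.
% Assumption: n = Jm, blocks S_1, ..., S_J are disjoint and of equal size m.
\mathbb{E}\big[\mathrm{gen}(W,S)\big]
  \le \frac{1}{J}\sum_{j=1}^{J}
      \sqrt{\frac{2\sigma^2}{m}\,
      \mathbb{E}_{S_j}\big[D_{\mathrm{KL}}\big(P_{W|S_j}\,\big\|\,Q_W\big)\big]}
```

Taking $m = n$ recovers the full-sample form, and $m = 1$ gives an individual-sample form. The exact constants and conditions of the paper's bounds may differ from this sketch.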

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01·Rating 4·Confidence 3

Strengths

* Although I did not verify all the proofs in detail, they appear to be sound overall.
* While not significant, the block-sample bound seems to be a non-trivial generalization of the prior results.

Weaknesses

I think the authors should put more effort into the presentation of this paper, for the following reasons:
* The authors throw out the definition of PAC-Bayes bounds at the very beginning, however, without giving enough explanation of the terms in the expression. For instance, I do not see any description of $Q_W$ and have trouble understanding what it means. Also, I advise the authors to give a concrete example of $I(n,d)$ in addition to just saying it is proportional to $n$ and $d$. Sim

Reviewer 02·Rating 8·Confidence 4

Strengths

I found the block decomposition concept novel, and the authors did convince me that the technique drives tighter bounds. Some might argue that this is an interpolation between the individual-sample ($m=1$) and the bulk-sample ($m=n$) regimes (borrowing the leave-one-out analysis technique, equality (b), from the former). However, the resulting bounds are tighter as a result of this interpolation, with the "optimal block rate" clearly explored in Section 5.

Weaknesses

For those outside the PAC-Bayes community, Sections 1-2 do not provide a good introduction to the concepts involved. In particular, *what is $Q_W$*? PAC-Bayes people of course know this is the prior probability; however, even if explicitly mentioned, the role of this prior would usually still be quite confusing for generic readers. The authors did even less in this regard by expending 0 words on $Q_W$, and how the evolution to $P_{W|S_j}$ is to be construed. Very minor notation issue: $\mathbb

Reviewer 03·Rating 4·Confidence 2

Strengths

I found the general topic of the study interesting. I would like to stress that also the negative result is interesting.

Weaknesses

* The "impossibility result" in Theorem 2 can be viewed as a general form of the properly cited results by Hrayr Harutyunyan, Greg Ver Steeg, and Aram Galstyan. Formal limitations of sample-wise information-theoretic generalization bounds. In 2022 IEEE Information Theory Workshop (ITW), pp. 440–445. IEEE, 2022.
* I was wondering: how are the results related to the results by Recursive PAC-Bayes: A frequentist approach to sequential prior updates with no information loss YS Wu, Y Zhang, BE C

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms