When Do Early-Exit Networks Generalize? A PAC-Bayesian Theory of Adaptive Depth
Dongxin Guo, Jikun Wu, Siu Ming Yiu

TL;DR
This paper develops a PAC-Bayesian theoretical framework for early-exit neural networks, providing new generalization bounds based on entropy and expected depth, and demonstrating their practical advantages through extensive experiments.
Contribution
It introduces the first entropy-based generalization bounds for adaptive-depth networks, with explicit constants and conditions for outperforming fixed-depth models.
Findings
Generalization bounds depend on exit-depth entropy and expected depth.
Experiments show bounds are tight and guide threshold selection effectively.
Adaptive-depth networks can provably outperform fixed-depth counterparts under certain conditions.
Abstract
Early-exit neural networks enable adaptive computation by allowing confident predictions to exit at intermediate layers, achieving 2-8× inference speedup. Despite widespread deployment, their generalization properties lack theoretical understanding -- a gap explicitly identified in recent surveys. This paper establishes a unified PAC-Bayesian framework for adaptive-depth networks. (1) Novel Entropy-Based Bounds: We prove the first generalization bounds depending on exit-depth entropy and expected depth rather than maximum depth , with sample complexity . (2) Explicit Constructive Constants: Our analysis yields the leading coefficient with complete derivation. (3) Provable Early-Exit Advantages: We establish sufficient conditions under which adaptive-depth networks…
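The two quantities the bounds are said to depend on, exit-depth entropy and expected depth, can be estimated from how often inputs leave at each exit. The sketch below is illustrative only: this excerpt does not give the paper's formal definitions, so it assumes the entropy is the Shannon entropy (in nats) of the empirical exit distribution and the expected depth is the mean 1-indexed exit position. The function name `exit_depth_stats` and its `exit_counts` argument are hypothetical.

```python
import math


def exit_depth_stats(exit_counts):
    """Return (entropy, expected_depth) for an empirical exit distribution.

    exit_counts[i] is the number of inputs that left the network at exit i + 1.
    Assumed definitions (not taken from the paper):
      entropy        = -sum_i p_i * ln(p_i)
      expected_depth = sum_i (i + 1) * p_i
    """
    total = sum(exit_counts)
    if total <= 0:
        raise ValueError("exit_counts must contain at least one sample")
    probs = [count / total for count in exit_counts]
    # Exits that are never used contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    expected_depth = sum((i + 1) * p for i, p in enumerate(probs))
    return entropy, expected_depth


# Hypothetical counts for a network with three exits.
entropy, expected_depth = exit_depth_stats([600, 300, 100])
```

Under these assumed definitions, a fixed-depth network is the special case where every input leaves at the last exit: the entropy is zero and the expected depth equals the maximum depth.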
