Generalisation under gradient descent via deterministic PAC-Bayes
Eugenio Clerico, Tyler Farghly, George Deligiannidis, Benjamin Guedj, and Arnaud Doucet

TL;DR
This paper develops new deterministic PAC-Bayesian generalisation bounds for models trained with gradient descent and related algorithms, providing fully computable guarantees that depend on initial conditions and the training trajectory.
Contribution
It introduces disintegrated PAC-Bayesian bounds that apply directly to deterministic optimisation algorithms, without any de-randomisation step, extending the theoretical understanding of generalisation in gradient-based training.
Findings
The bounds apply to SGD, momentum-based schemes, and damped Hamiltonian dynamics.
The bounds depend on the density of the initial distribution and on the Hessian of the training objective along the training trajectory.
The framework provides fully computable generalisation guarantees.
Abstract
We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.
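One way to see why the initial density and the Hessian would both appear is the change-of-variables formula: a gradient descent step w -> w - lr * grad(w) has Jacobian I - lr * H(w), so the log-density of the iterate can be tracked along the trajectory. The sketch below is only an illustration of that mechanism under stated assumptions, not the paper's bound. The function name `gd_log_density`, the toy quadratic objective, the standard Gaussian initial density, and the step size are all hypothetical choices, and each step map is assumed to be injective.

```python
import numpy as np


def gd_log_density(w0, grad, hess, log_rho0, lr, steps):
    """Run gradient descent from w0 and track the log-density of the iterate.

    Each step w -> w - lr * grad(w) has Jacobian I - lr * hess(w), so the
    change-of-variables formula subtracts log|det(I - lr * H)| per step.
    Assumes every step map is injective (for instance lr * ||H|| < 1).
    """
    w = np.asarray(w0, dtype=float)
    log_rho = log_rho0(w)
    dim = w.shape[0]
    for _ in range(steps):
        jac = np.eye(dim) - lr * hess(w)
        _, logabsdet = np.linalg.slogdet(jac)
        log_rho -= logabsdet
        w = w - lr * grad(w)
    return w, log_rho


# Hypothetical toy objective L(w) = 0.5 * w^T A w, chosen only for illustration.
A = np.diag([1.0, 0.5])


def grad(w):
    return A @ w


def hess(w):
    return A


def log_rho0(w):
    # Log-density of a standard Gaussian initial distribution (an assumption).
    return -0.5 * w @ w - 0.5 * w.shape[0] * np.log(2 * np.pi)


w_T, log_rho_T = gd_log_density(
    np.array([1.0, -2.0]), grad, hess, log_rho0, lr=0.1, steps=10
)
```

For this quadratic objective the final iterate is a linear image of the initial point, so the tracked log-density can be checked against the closed-form Gaussian density of the pushforward. For a general objective, only the initial density and the Hessians along the trajectory are needed.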
Peer Reviews
Decision·ALT 2025
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Machine Learning and Algorithms
