Time-independent Generalization Bounds for SGLD in Non-convex Settings
Tyler Farghly, Patrick Rebeschini

TL;DR
This paper derives time-independent generalization error bounds for SGLD with constant learning rate in non-convex settings. The bounds decay to zero as the sample size increases and do not depend on the number of iterations.
Contribution
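The authors provide generalization bounds for SGLD in non-convex settings that, unlike existing bounds, are time-independent. The bounds are obtained within the uniform stability framework by exploiting the Wasserstein contraction property of the Langevin diffusion. This also removes the need to bound gradients using Lipschitz-like assumptions; the analysis still assumes dissipativity and smoothness.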
Findings
Bounds decay to zero as sample size increases
Analysis extends to SGLD variants that use different discretization methods, Euclidean projections, or non-isotropic noise
Wasserstein contraction of the Langevin diffusion removes the need to bound gradients using Lipschitz-like assumptions
Abstract
We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/optimization literature. Unlike existing bounds for SGLD in non-convex settings, ours are time-independent and decay to zero as the sample size increases. Using the framework of uniform stability, we establish time-independent bounds by exploiting the Wasserstein contraction property of the Langevin diffusion, which also allows us to circumvent the need to bound gradients using Lipschitz-like assumptions. Our analysis also supports variants of SGLD that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.
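For readers unfamiliar with the algorithm, the following is a minimal sketch of constant-step-size SGLD with an optional Euclidean projection, one of the variants the abstract mentions. It is not taken from the paper: the function names (`sgld`, `toy_grad`), the inverse-temperature parameter `inv_temp`, the mini-batch scheme, and the toy loss are all illustrative assumptions, and the update shown is the standard form x ← x − η·g + sqrt(2η/β)·ξ with ξ standard Gaussian.

```python
import numpy as np

def sgld(grad_fn, data, x0, step_size=0.01, inv_temp=10.0,
         batch_size=8, n_steps=1000, radius=None, seed=0):
    """Hypothetical helper: run SGLD with a constant step size.

    grad_fn(x, batch) should return the mini-batch average gradient.
    If radius is given, each iterate is projected onto the Euclidean
    ball of that radius (the projected variant).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    n = len(data)
    for _ in range(n_steps):
        # Sample a mini-batch without replacement.
        idx = rng.choice(n, size=min(batch_size, n), replace=False)
        g = grad_fn(x, data[idx])
        # Isotropic Gaussian noise scaled by sqrt(2 * step / inv_temp).
        noise = rng.standard_normal(x.shape)
        x = x - step_size * g + np.sqrt(2.0 * step_size / inv_temp) * noise
        if radius is not None:
            norm = np.linalg.norm(x)
            if norm > radius:
                x = x * (radius / norm)  # Euclidean projection onto the ball
    return x

def toy_grad(x, batch):
    """Gradient of an assumed toy loss, averaged over the batch:
    f(x, z) = 0.5 * ||x - z||^2 + 2 * sum(cos(x_i)).
    The loss is non-convex, has a Lipschitz gradient, and grows
    quadratically; it is chosen only for illustration.
    """
    return (x - batch.mean(axis=0)) - 2.0 * np.sin(x)
```

A call such as `sgld(toy_grad, data, np.zeros(2), radius=5.0)` with `data` of shape `(n, 2)` returns the final iterate. The time-independent bounds described in the abstract concern how the generalization error of such an iterate behaves as `n_steps` grows; this sketch only shows the update rule and does not compute or verify any bound.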
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Advanced Neuroimaging Techniques and Applications · Advanced MRI Techniques and Applications
