Stability Based Generalization Bounds for Exponential Family Langevin Dynamics
Arindam Banerjee, Tiancong Chen, Xinyan Li, Yingxue Zhou

TL;DR
This paper introduces Exponential Family Langevin Dynamics (EFLD), a broad class of noisy stochastic algorithms, and derives sharper, data-dependent generalization bounds based on stability, with empirical validation on benchmark datasets.
Contribution
The paper extends stability-based generalization bounds from SGLD to the broader EFLD family, which includes noisy Sign-SGD and quantized SGD as special cases, and provides sharper bounds along with optimization guarantees.
Findings
The derived generalization bounds are non-vacuous and sharper than previous bounds, in part because they rely on expected rather than uniform stability.
EFLD includes noisy Sign-SGD and quantized SGD as special cases.
Empirical results on benchmark datasets confirm that the bounds behave correctly under noisy labels.
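To make the relationship between SGLD and a noisy Sign-SGD update concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's exact formulation: the SGLD step uses the standard form with an inverse temperature `beta`, while the sign step draws each coordinate's sign with probability `sigmoid(scale * grad_j)`, a parameterization assumed here for illustration. The function names and the parameters `beta` and `scale` are hypothetical.

```python
import numpy as np


def sgld_step(w, grad, lr, beta, rng):
    """One SGLD update: a gradient step plus isotropic Gaussian noise.

    Uses the standard form with inverse temperature `beta` (an assumption
    here; the paper's exact noise scaling may differ).
    """
    noise = rng.standard_normal(w.shape)
    return w - lr * grad + np.sqrt(2.0 * lr / beta) * noise


def noisy_sign_step(w, grad, lr, scale, rng):
    """One noisy sign update: every coordinate moves by exactly +/- lr.

    Assumed parameterization: P(sign_j = +1) = sigmoid(scale * grad_j),
    so a larger |grad_j| makes the sampled sign agree with sign(grad_j)
    more often.
    """
    p_plus = 1.0 / (1.0 + np.exp(-scale * grad))
    signs = np.where(rng.random(w.shape) < p_plus, 1.0, -1.0)
    return w - lr * signs


rng = np.random.default_rng(0)
w = np.zeros(4)
grad = np.array([3.0, -3.0, 0.5, -0.5])

w_sgld = sgld_step(w, grad, lr=0.1, beta=100.0, rng=rng)
w_sign = noisy_sign_step(w, grad, lr=0.1, scale=2.0, rng=rng)
```

In this sketch the SGLD noise is continuous and unbounded, whereas the sign update is discrete: each coordinate changes by exactly `lr` in magnitude, and only its direction is random. As `scale` grows, the sampled signs concentrate on `sign(grad)`, which recovers a deterministic sign step.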
Abstract
Recent years have seen advances in generalization bounds for noisy stochastic algorithms, especially stochastic gradient Langevin dynamics (SGLD) based on stability (Mou et al., 2018; Li et al., 2020) and information theoretic approaches (Xu and Raginsky, 2017; Negrea et al., 2019; Steinke and Zakynthinou, 2020). In this paper, we unify and substantially generalize stability based generalization bounds and make three technical contributions. First, we bound the generalization error in terms of expected (not uniform) stability which arguably leads to quantitatively sharper bounds. Second, as our main contribution, we introduce Exponential Family Langevin Dynamics (EFLD), a substantial generalization of SGLD, which includes noisy versions of Sign-SGD and quantized SGD as special cases. We establish data-dependent expected stability based generalization bounds for any EFLD algorithm with a…
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Statistical Methods and Inference · Stochastic Gradient Optimization Techniques
Methods: Stochastic Gradient Descent · Class-activation map
