Almost Sure Convergence Analysis of Differentially Private Stochastic Gradient Methods
Amartya Mukherjee, Jun Liu

TL;DR
This paper proves that differentially private stochastic gradient descent algorithms converge almost surely under standard smoothness assumptions and decaying step sizes, strengthening the theoretical understanding of their long-run behavior in both nonconvex and strongly convex settings.
Contribution
It establishes almost sure convergence of DP-SGD and its momentum variants, extending theoretical guarantees from convergence in expectation or with high probability to convergence of individual trajectories.
Findings
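DP-SGD converges almost surely under standard smoothness assumptions, in both nonconvex and strongly convex settings, provided the step sizes satisfy standard decay conditions.
Momentum variants such as the stochastic heavy ball (DP-SHB) and Nesterov's accelerated gradient (DP-NAG) admit similar guarantees via careful energy constructions.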
The results provide stronger theoretical foundations for the stability of private optimization algorithms.
Abstract
Differentially private stochastic gradient descent (DP-SGD) has become the standard algorithm for training machine learning models with rigorous privacy guarantees. Despite its widespread use, the theoretical understanding of its long-run behavior remains limited: existing analyses typically establish convergence in expectation or with high probability, but do not address the almost sure convergence of single trajectories. In this work, we prove that DP-SGD converges almost surely under standard smoothness assumptions, both in nonconvex and strongly convex settings, provided the step sizes satisfy some standard decaying conditions. Our analysis extends to momentum variants such as the stochastic heavy ball (DP-SHB) and Nesterov's accelerated gradient (DP-NAG), where we show that careful energy constructions yield similar guarantees. These results provide stronger theoretical foundations…
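To make the setting concrete, here is a minimal sketch of a DP-SGD loop on a toy strongly convex objective, f(x) = mean_i 0.5 * ||x - a_i||^2. It is an illustration under stated assumptions, not the paper's algorithm or experiments: the function name `dp_sgd`, the synthetic data, and the step-size schedule a0 / (t + 1)^0.75 are all hypothetical choices. The schedule is one common example of a decaying step size whose sum diverges while the sum of squares converges; the abstract only says "standard decaying conditions", so the exact conditions used in the paper may differ. Privacy accounting is deliberately left out.

```python
import numpy as np

def dp_sgd(data, steps=5000, batch_size=32, clip_norm=1.0,
           noise_multiplier=1.0, a0=0.5, decay=0.75, seed=0):
    """Toy DP-SGD on f(x) = mean_i 0.5 * ||x - a_i||^2 (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    x = np.zeros(d)
    for t in range(steps):
        # Sample a minibatch (with replacement, for simplicity).
        batch = data[rng.integers(0, n, size=batch_size)]
        # Per-sample gradients of 0.5 * ||x - a_i||^2 are x - a_i.
        grads = x - batch
        # Clip each per-sample gradient to norm at most clip_norm.
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        # Add Gaussian noise scaled to the clipping threshold, then average.
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=d)
        g = (grads.sum(axis=0) + noise) / batch_size
        # Assumed decaying step size: sum diverges, sum of squares converges.
        x = x - a0 / (t + 1) ** decay * g
    return x

# Hypothetical synthetic data clustered around 0.5 in each coordinate.
rng = np.random.default_rng(1)
data = 0.5 + 0.1 * rng.standard_normal((1000, 5))

x_final = dp_sgd(data)
error = np.linalg.norm(x_final - data.mean(axis=0))
print(error)  # distance from the empirical mean, the minimizer of f
```

The clipping threshold is chosen larger than the typical gradient norm near the minimizer, so clipping is active only in the early iterations; with a tighter threshold, clipping bias can shift the limit point away from the minimizer. A single run like this shows one trajectory settling down, which is the kind of pathwise behavior the almost sure results address, but it is not evidence for the theorem itself.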
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Markov Chains and Monte Carlo Methods
