Optimizer Dynamics at the Edge of Stability with Differential Privacy

Ayana Hussain; Ricky Fang

arXiv:2512.19019·cs.LG·December 23, 2025

Open Access

TL;DR

This paper investigates how differential privacy affects neural network training dynamics. It finds that the privacy-preserving modifications (per-example gradient clipping and Gaussian noise) alter stability patterns, though some characteristic Edge of Stability behaviors persist.

Contribution

The paper provides a detailed analysis of how differential privacy affects optimizer stability and sharpness dynamics during training.

Findings

01. DP reduces sharpness and alters stability thresholds

02. Patterns of Edge of Stability persist under DP conditions

03. Large privacy budgets can approach classical stability limits
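
The "classical stability limit" in the third finding presumably refers to the textbook threshold for gradient descent on a quadratic loss: with step size η, the iterates stay bounded only while the sharpness is below 2/η. The sketch below illustrates that threshold on a hand-picked 2x2 Hessian. The matrix, the step sizes, and the helper names `sharpness` and `is_classically_stable` are assumptions made for illustration and do not come from the paper.

```python
import math

def matvec(H, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def sharpness(H, iters=200):
    """Estimate the largest eigenvalue of a symmetric positive
    semi-definite matrix by power iteration (assumes H is PSD, so the
    dominant eigenvalue is also the maximum one)."""
    v = [1.0] * len(H)
    for _ in range(iters):
        w = matvec(H, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector.
    return sum(a * b for a, b in zip(v, matvec(H, v)))

def is_classically_stable(H, lr):
    """Classical GD condition on a quadratic: sharpness < 2 / lr."""
    return sharpness(H) < 2.0 / lr

# Toy Hessian chosen for illustration; its eigenvalues are (5 +/- sqrt(5)) / 2.
H = [[3.0, 1.0], [1.0, 2.0]]
print(sharpness(H))                   # close to 3.618
print(is_classically_stable(H, 0.5))  # threshold 2 / 0.5 = 4.0
print(is_classically_stable(H, 0.6))  # threshold 2 / 0.6 is about 3.33
```

For a real network the Hessian is never formed explicitly; the same power iteration is usually run on Hessian-vector products instead.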

Abstract

Deep learning models can reveal sensitive information about individual training examples. Differential privacy (DP) provides guarantees that restrict such leakage, but it also alters optimization dynamics in poorly understood ways. We study the training dynamics of neural networks under DP by comparing Gradient Descent (GD) and Adam to their privacy-preserving variants. Prior work shows that these optimizers exhibit distinct stability dynamics: full-batch methods train at the Edge of Stability (EoS), while mini-batch and adaptive methods exhibit analogous edge-of-stability behavior. In these regimes, the training loss and the sharpness (the maximum eigenvalue of the training loss Hessian) exhibit characteristic behavior. In DP training, per-example gradient clipping and Gaussian noise modify the update rule, and it is unclear whether these stability patterns persist. We…
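
To make the modified update rule concrete, here is a minimal sketch of one full-batch DP gradient descent step in the commonly used clip-then-add-noise form. The paper's exact parameterization is not given in this listing, so the function names (`clip`, `dp_gd_step`), the `noise_multiplier` parameter, and the convention of scaling the noise standard deviation by the clipping norm are all assumptions.

```python
import math
import random

def clip(grad, clip_norm):
    """Scale a gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dp_gd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-GD step: clip each example's gradient, sum, add Gaussian
    noise, average, then take a plain gradient step."""
    n = len(per_example_grads)
    clipped = [clip(g, clip_norm) for g in per_example_grads]
    summed = [sum(coords) for coords in zip(*clipped)]
    # Assumed convention: noise std = noise_multiplier * clip_norm.
    sigma = noise_multiplier * clip_norm
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / n for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_mean)]

# Toy example: two per-example gradients, one above the clipping norm.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]
no_noise = dp_gd_step([0.0, 0.0], grads, lr=1.0, clip_norm=1.0,
                      noise_multiplier=0.0, rng=rng)
with_noise = dp_gd_step([0.0, 0.0], grads, lr=1.0, clip_norm=1.0,
                        noise_multiplier=1.0, rng=rng)
print(no_noise, with_noise)
```

With `noise_multiplier=0.0` the step reduces to clipped gradient descent, which makes it easy to separate the effect of clipping from the effect of noise when looking at stability.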

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning