Improving Differentially Private SGD via Randomly Sparsified Gradients
Junyi Zhu, Matthew B. Blaschko

TL;DR
This paper introduces random sparsification for DP-SGD: gradients are randomly sparsified before clipping and noisification, which lowers the upper bound on the convergence rate when the noise is dominant. The resulting sparse gradients also reduce communication costs and strengthen privacy against reconstruction attacks.
Contribution
The paper proposes a random sparsification extension to DP-SGD and supports it with a convergence analysis and with empirical improvements in privacy, efficiency, and model performance.
Findings
Random sparsification improves DP-SGD performance.
Sparse gradients reduce communication costs.
Sparse gradients enhance privacy against reconstruction attacks.
Abstract
Differentially private stochastic gradient descent (DP-SGD) has been widely adopted in deep learning to provide rigorously defined privacy, which requires gradient clipping to bound the maximum norm of individual gradients and additive isotropic Gaussian noise. With analysis of the convergence rate of DP-SGD in a non-convex setting, we identify that randomly sparsifying gradients before clipping and noisification adjusts a trade-off between internal components of the convergence bound and leads to a smaller upper bound when the noise is dominant. Additionally, our theoretical analysis and empirical evaluations show that the trade-off is not trivial but possibly a unique property of DP-SGD, as either canceling noisification or gradient clipping eliminates the trade-off in the bound. This observation is indicative, as it implies DP-SGD has special inherent room for (even simply random)…
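The ordering described in the abstract (sparsify, then clip, then add noise) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, the `keep_rate` parameter, the use of one random mask shared across the batch, and the choice to add noise only on the kept coordinates are all assumptions made for illustration.

```python
import numpy as np

def dp_sgd_step_with_random_sparsification(
    per_sample_grads, keep_rate, clip_norm, noise_multiplier, rng
):
    """One illustrative DP-SGD gradient estimate with random sparsification.

    per_sample_grads: array of shape (batch_size, dim), one gradient per example.
    keep_rate: fraction of coordinates kept by the random mask (hypothetical name).
    Returns the noisy averaged gradient and the boolean mask that was used.
    """
    batch_size, dim = per_sample_grads.shape

    # Assumption: a single data-independent random mask is drawn per step
    # and shared by every example in the batch.
    n_keep = max(1, int(round(keep_rate * dim)))
    kept = rng.choice(dim, size=n_keep, replace=False)
    mask = np.zeros(dim, dtype=bool)
    mask[kept] = True

    # 1) Sparsify each per-sample gradient before clipping.
    sparse = per_sample_grads * mask

    # 2) Clip each sparsified per-sample gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(sparse, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = sparse * scale

    # 3) Add isotropic Gaussian noise. Assumption: noise is restricted to the
    #    kept coordinates, so the noise vector lives in the reduced subspace.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=dim) * mask

    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    return noisy_mean, mask
```

A usage example under the same assumptions: `rng = np.random.default_rng(0)` followed by `update, mask = dp_sgd_step_with_random_sparsification(grads, 0.3, 1.0, 1.0, rng)` returns an update that is zero outside the mask. Only the kept coordinates would need to be transmitted, which is where a communication saving would come from.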
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
Methods: Gradient Clipping
