Improving the Convergence of Private Shuffled Gradient Methods with Public Data

Shuli Jiang; Pranay Sharma; Zhiwei Steven Wu; Gauri Joshi

arXiv:2502.03652 · cs.LG · February 25, 2026


TL;DR

This paper analyzes private shuffled gradient methods for convex ERM, shows that shuffling leads to worse empirical excess risk than DP-SGD, and proposes a hybrid approach that uses public data to reduce empirical excess risk in differentially private training.

Contribution

The paper provides the first empirical excess risk bound for DP-ShuffleG and introduces Interleaved-ShuffleG, a hybrid method that leverages public data to improve the privacy-utility trade-off.

Findings

01. Data shuffling worsens empirical excess risk compared to DP-SGD.

02. Interleaved-ShuffleG reduces excess risk by integrating public data.

03. Experiments show Interleaved-ShuffleG outperforms baselines on multiple datasets.

Abstract

We consider the problem of differentially private (DP) convex empirical risk minimization (ERM). While the standard DP-SGD algorithm is theoretically well-established, practical implementations often rely on shuffled gradient methods that traverse the training data sequentially rather than sampling with replacement in each iteration. Despite their widespread use, the theoretical privacy-accuracy trade-offs of private shuffled gradient methods (DP-ShuffleG) remain poorly understood, leading to a gap between theory and practice. In this work, we leverage privacy amplification by iteration (PABI) and a novel application of Stein's lemma to provide the first empirical excess risk bound of DP-ShuffleG. Our result shows that data shuffling results in worse empirical excess risk for DP-ShuffleG compared to DP-SGD. To address this limitation, we propose…
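
The difference between the two sampling schemes mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or analysis: the function name `noisy_gradient_descent`, the scalar least-squares loss, and the parameters `lr`, `sigma`, and `clip_norm` are all assumptions made for illustration, and `sigma` is not calibrated to any privacy budget. The only point is that a shuffled pass visits every example exactly once per epoch, while sampling with replacement does not guarantee this.

```python
import random


def grad(w, x, y):
    # Gradient of the squared loss 0.5 * (w * x - y) ** 2 for a scalar model.
    return (w * x - y) * x


def clip(g, c):
    # Clip a scalar gradient to the interval [-c, c].
    return max(-c, min(c, g))


def noisy_gradient_descent(data, epochs, lr, sigma, clip_norm, shuffle, seed=0):
    """Noisy clipped gradient steps on a toy problem (illustrative only).

    shuffle=True : each epoch traverses a random permutation of the data.
    shuffle=False: each step samples an index uniformly with replacement.
    Returns the final iterate and how often each example was visited.
    """
    rng = random.Random(seed)
    n = len(data)
    w = 0.0
    visits = [0] * n
    for _ in range(epochs):
        if shuffle:
            order = list(range(n))
            rng.shuffle(order)
        else:
            order = [rng.randrange(n) for _ in range(n)]
        for i in order:
            x, y = data[i]
            # Gaussian noise stands in for the privacy noise; its scale is
            # a free parameter here, not derived from any (epsilon, delta).
            g = clip(grad(w, x, y), clip_norm) + rng.gauss(0.0, sigma)
            w -= lr * g
            visits[i] += 1
    return w, visits


if __name__ == "__main__":
    toy = [(x, 2.0 * x) for x in (0.5, 0.75, 1.0, 1.25, 1.5)]
    for mode in (True, False):
        w, visits = noisy_gradient_descent(
            toy, epochs=20, lr=0.1, sigma=0.05, clip_norm=10.0, shuffle=mode
        )
        print("shuffle" if mode else "with replacement", round(w, 3), visits)
```

With `shuffle=True` the visit counts are identical across examples; with `shuffle=False` they generally differ from run to run, which is the sampling distinction the abstract draws between shuffled gradient methods and sampling with replacement.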


Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research