To Shuffle or not to Shuffle: Auditing DP-SGD with Shuffling
Meenatchi Sundaram Muthu Selva Annamalai, Borja Balle, Jamie Hayes, Emiliano De Cristofaro

TL;DR
This paper introduces auditing procedures for evaluating the privacy guarantees of DP-SGD with shuffling. It finds that the guarantees reported for models trained this way are often overstated: empirical leakage is up to 4 times higher than reported, and certain shuffling variations increase leakage by up to 10 times.
Contribution
The paper develops DP-auditing methods to accurately measure privacy leakage in shuffling-based DP-SGD, exposing overestimations in existing privacy guarantees and analyzing variations in shuffling procedures.
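As a rough illustration of what "measuring privacy leakage" usually means in DP auditing, the sketch below turns the error rates of a membership-style attack into an empirical lower bound on epsilon. It relies on the standard hypothesis-testing property of (epsilon, delta)-DP and is a generic textbook conversion, not necessarily the procedure used in this paper. The function names are hypothetical.

```python
import math


def _bound(a, b, delta):
    # (eps, delta)-DP implies (1 - delta - a) <= exp(eps) * b
    # for the two error rates (a, b) of any distinguishing attack.
    numerator = 1.0 - delta - a
    if numerator <= 0.0:
        return 0.0          # the inequality holds trivially, no information
    if b == 0.0:
        return math.inf     # a perfect attack rules out any finite epsilon
    return math.log(numerator / b)


def empirical_epsilon(fpr, fnr, delta=0.0):
    """Point estimate of the epsilon lower bound implied by attack error rates.

    fpr, fnr: false positive / false negative rates of the attack (assumed
    to be measured over many training runs with and without the target).
    """
    return max(0.0, _bound(fnr, fpr, delta), _bound(fpr, fnr, delta))


# An attack that is wrong 10% of the time in both directions.
print(empirical_epsilon(0.1, 0.1))
```

A real audit would typically replace the raw error rates with statistical upper confidence bounds before applying this conversion, so that the resulting epsilon is a valid lower bound with high probability rather than a point estimate.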
Findings
The privacy guarantees reported for DP models trained with shuffling are overestimated by up to 4 times.
The gap between theoretical and actual privacy leakage varies with parameters.
Certain shuffling variations increase privacy leakage by up to 10 times.
Abstract
The Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm supports the training of machine learning (ML) models with formal Differential Privacy (DP) guarantees. Traditionally, DP-SGD processes training data in batches using Poisson subsampling to select each batch at every iteration. More recently, shuffling has become a common alternative due to its better compatibility and lower computational overhead. However, computing tight theoretical DP guarantees under shuffling remains an open problem. As a result, models trained with shuffling are often evaluated as if Poisson subsampling were used, which might result in incorrect privacy guarantees. This raises a compelling research question: can we verify whether there are gaps between the theoretical DP guarantees reported by state-of-the-art models using shuffling and their actual leakage? To do so, we define novel…
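The two batch-selection schemes contrasted in the abstract can be sketched as follows. This is a minimal illustration of the sampling step only, assuming a dataset addressed by integer indices; gradient clipping and noise addition are omitted, and the function names and parameters are hypothetical rather than taken from the paper.

```python
import random


def poisson_batches(n, q, steps, rng):
    """Poisson subsampling: at every step, each of the n examples is
    included independently with probability q, so batch sizes vary."""
    return [[i for i in range(n) if rng.random() < q] for _ in range(steps)]


def shuffled_batches(n, batch_size, rng):
    """Shuffling: permute the dataset once per epoch, then cut it into
    consecutive fixed-size batches (the last one may be smaller)."""
    perm = list(range(n))
    rng.shuffle(perm)
    return [perm[i:i + batch_size] for i in range(0, n, batch_size)]


rng = random.Random(0)
epoch = shuffled_batches(10, 4, rng)
print([len(b) for b in epoch])                            # fixed sizes: 4, 4, 2
print([len(b) for b in poisson_batches(10, 0.4, 3, rng)])  # sizes vary per step
```

Under shuffling every example appears exactly once per epoch, whereas under Poisson subsampling an example may appear in several batches or in none. That difference in the sampling distribution is why an analysis derived for one scheme need not carry over to the other.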
Taxonomy
Topics: Business Process Modeling and Analysis
