Unlocking High-Accuracy Differentially Private Image Classification through Scale
Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle

TL;DR
This paper demonstrates that, with careful hyper-parameter tuning and a few simple training techniques, differentially private training can reach high accuracy on image classification tasks and surpass previous state-of-the-art results.
Contribution
The authors show that DP-SGD can perform significantly better on over-parameterized models with careful tuning, achieving new state-of-the-art results on CIFAR-10 and ImageNet.
Findings
Achieved 81.4% accuracy on CIFAR-10 with DP-SGD under (8, 10^-5)-DP without extra data, surpassing the previous SOTA of 71.7%.
Reached 83.8% top-1 accuracy on ImageNet under (0.5, 8·10^-7)-DP when fine-tuning a pre-trained model.
Accuracy under differential privacy comes close to non-private SOTA, narrowing the privacy-accuracy gap.
Abstract
Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent (DP-SGD), the most popular DP training method for deep learning, realizes this protection by injecting noise during training. However, previous works have found that DP-SGD often leads to a significant degradation in performance on standard image classification benchmarks. Furthermore, some authors have postulated that DP-SGD inherently performs poorly on large models, since the norm of the noise required to preserve privacy is proportional to the model dimension. In contrast, we demonstrate that DP-SGD on over-parameterized models can perform significantly better than previously thought. Combining careful hyper-parameter tuning with simple techniques to…
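The abstract describes DP-SGD as protecting individual training points by injecting noise during training. A minimal sketch of one such update is given here for orientation only: it clips each per-example gradient to a fixed L2 norm, sums the clipped gradients, adds Gaussian noise scaled to the clipping norm, and averages. The function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions, not the authors' code, and the sketch does no privacy accounting.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD update on a flat parameter vector.

    params:            array of shape (dim,)
    per_example_grads: array of shape (batch, dim), one gradient per example
    clip_norm:         maximum L2 norm allowed for each per-example gradient
    noise_multiplier:  noise std is noise_multiplier * clip_norm
    rng:               a numpy.random.Generator
    """
    batch_size = per_example_grads.shape[0]

    # Rescale each per-example gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Add Gaussian noise to the sum of clipped gradients, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean_grad = (clipped.sum(axis=0) + noise) / batch_size

    return params - lr * noisy_mean_grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = np.zeros(4)
    grads = rng.normal(size=(8, 4))  # stand-in for real per-example gradients
    print(dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0,
                      noise_multiplier=1.0, rng=rng))
```

Because the noise is added to every coordinate, its total norm grows with the number of parameters, which is the concern about large models that the abstract refers to.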
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Advanced Neural Network Applications
Methods: Residual Connection · Average Pooling · Residual Block · 1x1 Convolution · Kaiming Initialization · Batch Normalization · Bottleneck Residual Block · Convolution · Max Pooling
