Batch Clipping and Adaptive Layerwise Clipping for Differential Private Stochastic Gradient Descent
Toan N. Nguyen, Phuong Ha Nguyen, Lam M. Nguyen, Marten Van Dijk

TL;DR
This paper introduces Batch Clipping and Adaptive Layerwise Clipping methods for Differential Private SGD, enabling the use of Batch Normalization in deep neural networks while providing rigorous differential privacy guarantees.
Contribution
It proposes novel Batch Clipping and Adaptive Layerwise Clipping techniques with rigorous differential privacy proofs, improving deep neural network training under privacy constraints.
Findings
Batch Clipping allows the use of Batch Normalization in DP training.
The proposed methods converge on CIFAR-10 with ResNet-18, unlike previous approaches.
Rigorous DP proofs are provided for both techniques.
Abstract
Each round in Differential Private Stochastic Gradient Descent (DPSGD) transmits a sum of clipped gradients, obfuscated with Gaussian noise, to a central server, which uses it to update a global model that often represents a deep neural network. Since the clipped gradients are computed separately, which we call Individual Clipping (IC), deep neural networks like ResNet-18 cannot use Batch Normalization Layers (BNL), which are a crucial component in deep neural networks for achieving high accuracy. To utilize BNL, we introduce Batch Clipping (BC) where, instead of clipping single gradients as in the original DPSGD, we average and clip batches of gradients. Moreover, the model entries of different layers have different sensitivities to the added Gaussian noise. Therefore, Adaptive Layerwise Clipping methods (ALC), where each layer has its own adaptively finetuned clipping constant, have…
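The difference between Individual Clipping and Batch Clipping described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the noise standard deviation `sigma * c`, and the fixed per-layer constants in `layerwise_clip` are assumptions made for illustration only. In particular, the abstract does not say how the layerwise constants are adapted, so that step is left out.

```python
import numpy as np

def clip(v, c):
    """Scale v down so that its L2 norm is at most c."""
    norm = np.linalg.norm(v)
    return v * min(1.0, c / max(norm, 1e-12))

def individual_clipping(per_sample_grads, c, sigma, rng):
    """IC: clip every per-sample gradient, sum them, then add Gaussian noise."""
    total = sum(clip(g, c) for g in per_sample_grads)
    # Noise scale sigma * c is an assumption, not taken from the paper.
    return total + rng.normal(0.0, sigma * c, size=total.shape)

def batch_clipping(per_sample_grads, c, sigma, rng):
    """BC: average the batch first, clip the average once, then add noise."""
    avg = np.mean(per_sample_grads, axis=0)
    clipped = clip(avg, c)
    # Noise scale sigma * c is an assumption, not taken from the paper.
    return clipped + rng.normal(0.0, sigma * c, size=clipped.shape)

def layerwise_clip(layer_grads, layer_constants):
    """Clip each layer's gradient with its own constant (constants are hypothetical)."""
    return [clip(g, c) for g, c in zip(layer_grads, layer_constants)]

rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.0, 0.5]])  # two toy per-sample gradients
ic_update = individual_clipping(grads, c=1.0, sigma=0.0, rng=rng)
bc_update = batch_clipping(grads, c=1.0, sigma=0.0, rng=rng)
```

Because BC only needs the gradient of the whole batch, it does not require gradients to be computed separately per sample, which is the property the abstract identifies as the obstacle to using Batch Normalization under IC.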
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: Batch Normalization · Contrastive Language-Image Pre-training
