On the effect of normalization layers on Differentially Private training of deep Neural networks
Ali Davody, David Ifeoluwa Adelani, Thomas Kleinbauer, Dietrich Klakow

TL;DR
This paper investigates how normalization layers affect the performance of differentially private training of deep neural networks, proposing a new method to integrate batch normalization with DPSGD to improve accuracy without extra privacy loss.
Contribution
It introduces a novel approach for combining batch normalization with DPSGD, enabling training of deeper networks with enhanced utility-privacy trade-offs.
Findings
Normalization layers significantly influence DPSGD utility.
The proposed method allows deeper networks to be trained with DP.
The new approach achieves a better utility-privacy trade-off.
Abstract
Differentially private stochastic gradient descent (DPSGD) is a variation of stochastic gradient descent based on the Differential Privacy (DP) paradigm, which can mitigate privacy threats that arise from the presence of sensitive information in training data. However, one major drawback of training deep neural networks with DPSGD is a reduction in the model's accuracy. In this paper, we study the effect of normalization layers on the performance of DPSGD. We demonstrate that normalization layers significantly impact the utility of deep neural networks with noisy parameters and should be considered essential ingredients of training with DPSGD. In particular, we propose a novel method for integrating batch normalization with DPSGD without incurring an additional privacy loss. With our approach, we are able to train deeper networks and achieve a better utility-privacy trade-off.
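For readers unfamiliar with DPSGD, the following is a minimal sketch of one generic DPSGD update in the standard formulation: clip each example's gradient, sum, add Gaussian noise, then average. It is not the paper's batch-normalization method, which the abstract does not specify. The function name `dpsgd_step` and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions, and privacy accounting is omitted.

```python
import numpy as np

def dpsgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One generic DPSGD update (illustrative sketch, not the paper's method).

    per_example_grads has shape (batch_size, num_params).
    """
    # L2 norm of each example's gradient, shape (batch_size, 1).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Rescale so that every per-example gradient has norm at most clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm,
    # added to the sum of clipped gradients.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]
    return params - lr * noisy_mean

rng = np.random.default_rng(0)
params = np.zeros(2)
grads = np.array([[3.0, 4.0], [0.0, 0.0]])
new_params = dpsgd_step(params, grads, lr=1.0, clip_norm=1.0,
                        noise_multiplier=1.0, rng=rng)
```

Clipping bounds each example's influence on the update, which is why layers that mix statistics across examples in a batch, such as standard batch normalization, need special handling under DPSGD.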
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
Methods: Layer Normalization
