Bias-Aware Minimisation: Understanding and Mitigating Estimator Bias in Private SGD
Moritz Knolle, Robert Dorfman, Alexander Ziller, Daniel Rueckert, and Georgios Kaissis

TL;DR
This paper introduces Bias-Aware Minimisation (BAM), a method to reduce estimator bias in private SGD, leading to improved model utility and privacy-utility trade-offs on standard datasets.
Contribution
The paper proposes BAM, a novel bias reduction technique for private SGD that is scalable and improves privacy-utility trade-offs, with theoretical and empirical validation.
Findings
BAM reduces bias in private gradient estimates.
BAM improves privacy-utility trade-offs on CIFAR and ImageNet datasets.
BAM scales efficiently to large neural networks.
Abstract
Differentially private SGD (DP-SGD) holds the promise of enabling the safe and responsible application of machine learning to sensitive datasets. However, DP-SGD only provides a biased, noisy estimate of a mini-batch gradient. This renders optimisation steps less effective and limits model utility as a result. With this work, we show a connection between per-sample gradient norms and the estimation bias of the private gradient oracle used in DP-SGD. Here, we propose Bias-Aware Minimisation (BAM) that allows for the provable reduction of private gradient estimator bias. We show how to efficiently compute quantities needed for BAM to scale to large neural networks and highlight similarities to closely related methods such as Sharpness-Aware Minimisation. Finally, we provide empirical evidence that BAM not only reduces bias but also substantially improves privacy-utility trade-offs on the…
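The bias the abstract refers to can be illustrated with the standard DP-SGD gradient estimator: clip each per-sample gradient to a maximum L2 norm, sum, add Gaussian noise, and average. The sketch below is not BAM itself and is not taken from the paper; the function names, the toy gradients, and the parameters `max_norm` and `noise_multiplier` are assumptions chosen for illustration. It shows only that, because the noise has zero mean, the expected private estimate differs from the true mini-batch gradient exactly when some per-sample gradient norm exceeds the clipping threshold.

```python
import math
import random


def l2_norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def clip(vec, max_norm):
    """Scale vec so that its L2 norm is at most max_norm."""
    norm = l2_norm(vec)
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def mean(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def private_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    n = len(per_sample_grads)
    clipped = [clip(g, max_norm) for g in per_sample_grads]
    summed = [sum(col) for col in zip(*clipped)]
    std = noise_multiplier * max_norm  # noise scale tied to the clipping norm
    return [(s + rng.gauss(0.0, std)) / n for s in summed]


def clipping_bias(per_sample_grads, max_norm):
    """Expected private estimate minus the true mini-batch gradient.

    The Gaussian noise has zero mean, so only clipping contributes.
    """
    true_grad = mean(per_sample_grads)
    clipped_grad = mean([clip(g, max_norm) for g in per_sample_grads])
    return [c - t for c, t in zip(clipped_grad, true_grad)]


# Toy mini-batch (made-up values): two small gradients and one large one.
grads = [[0.3, 0.4], [0.0, 0.5], [3.0, 4.0]]

bias_tight = clipping_bias(grads, max_norm=1.0)   # the large gradient is clipped
bias_loose = clipping_bias(grads, max_norm=10.0)  # no gradient is clipped

# One noisy draw of the private estimate, seeded for reproducibility.
noisy_estimate = private_gradient(grads, 1.0, 1.0, random.Random(0))
```

In this toy batch only the third gradient (norm 5) exceeds the tight threshold, so `bias_tight` is non-zero while `bias_loose` vanishes. This is consistent with the link between per-sample gradient norms and estimator bias that the abstract describes, though the paper's own analysis may be stated differently.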
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Neural Network Applications · Medical Imaging and Analysis
Methods: Bias-Aware Minimisation · Stochastic Gradient Descent
