Inference and Interference: The Role of Clipping, Pruning and Loss Landscapes in Differentially Private Stochastic Gradient Descent

Lauren Watson; Eric Gan; Mohan Dantam; Baharan Mirzasoleiman; Rik Sarkar

arXiv:2311.06839 · cs.LG · November 14, 2023 · 1 citation

Open Access

TL;DR

This paper investigates the dynamics of DP-SGD, revealing that clipping impacts training more than noise, and demonstrates that magnitude pruning can enhance DP-SGD performance in large neural networks.

Contribution

The study provides a detailed analysis of DP-SGD's behavior, highlighting the dominant role of clipping over noise and proposing pruning as a method to improve privacy-preserving training.

Findings

01. Clipping has a larger impact than noise on DP-SGD performance.

02. Heavy pruning can improve test accuracy of DP-SGD.

03. Behavior in later training stages determines overall results.
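
To make the pruning finding concrete, here is a minimal sketch of generic magnitude pruning: the fraction of weights with the smallest absolute values is zeroed out, and a mask records which coordinates remain trainable. The function name `magnitude_prune` and the `sparsity` parameter are illustrative assumptions; this page does not state how the paper selects or applies its mask, so this should not be read as the authors' procedure.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    Hypothetical helper for illustration; returns (pruned_weights, mask).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    # mask[i] == 1.0 means the weight is kept, 0.0 means it is pruned.
    mask = [0.0 if i in pruned_idx else 1.0 for i in range(len(weights))]
    pruned = [w if m else 0.0 for w, m in zip(weights, mask)]
    return pruned, mask


pruned, mask = magnitude_prune([0.5, -0.1, 2.0, 0.05], sparsity=0.5)
```

In a DP-SGD setting, one would typically multiply gradients by the same mask so that pruned coordinates stay at zero, which reduces the effective dimension in which clipping and noise act. That coupling is an assumption of this sketch.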

Abstract

Differentially private stochastic gradient descent (DP-SGD) is known to have poorer training and test performance on large neural networks compared to ordinary stochastic gradient descent (SGD). In this paper, we perform a detailed study and comparison of the two processes and unveil several new insights. By comparing the behavior of the two processes separately in early and late epochs, we find that while DP-SGD makes slower progress in early stages, it is the behavior in the later stages that determines the end result. A separate analysis of the clipping and noise addition steps of DP-SGD shows that while noise introduces errors to the process, gradient descent can recover from these errors when it is not clipped, and clipping appears to have a larger impact than noise. These effects are amplified in higher dimensions (large neural networks), where the loss basin occupies a lower…
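
The two steps the abstract separates, clipping and noise addition, can be sketched as a single update in the commonly used DP-SGD formulation: each per-example gradient is clipped to an L2 norm bound, the clipped gradients are summed, Gaussian noise scaled to that bound is added, and the result is averaged. The names `clip_gradient`, `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are assumptions for illustration, and the paper's exact experimental setup may differ.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD update on a flat parameter list."""
    dim = len(params)
    summed = [0.0] * dim
    # Clipping step: bound each example's contribution before summing.
    for grad in per_example_grads:
        clipped = clip_gradient(grad, clip_norm)
        for i in range(dim):
            summed[i] += clipped[i]
    # Noise step: Gaussian noise with std = noise_multiplier * clip_norm.
    sigma = noise_multiplier * clip_norm
    batch_size = len(per_example_grads)
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_mean)]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
new_params = dp_sgd_step([1.0, -2.0], grads, lr=0.1, clip_norm=1.0,
                         noise_multiplier=1.0, rng=rng)
```

Setting `noise_multiplier=0.0` isolates the effect of clipping, while a very large `clip_norm` with nonzero noise isolates the effect of noise. This is one simple way to study the two steps separately, though not necessarily how the authors ran their comparison.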

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Machine Learning and ELM

Methods: Pruning