TL;DR
This paper introduces fast, GPU-compatible methods for per-example gradient clipping in differentially private deep learning, significantly reducing training time and enabling more practical privacy-preserving models.
Contribution
It proposes new techniques for per-example gradient clipping that improve GPU efficiency and are compatible with auto-differentiation frameworks, accelerating private deep learning training.
Findings
Achieves 54x-94x speed-ups in training time.
Works with a variety of neural network architectures.
Reduces the training-time overhead of differentially private training.
Abstract
Recent work on Rényi Differential Privacy has shown the feasibility of applying differential privacy to deep learning tasks. Despite their promise, however, differentially private deep networks often lag far behind their non-private counterparts in accuracy, showing the need for more research in model architectures, optimizers, etc. One of the barriers to this expanded research is the training time -- often orders of magnitude larger than training non-private networks. The reason for this slowdown is a crucial privacy-related step called "per-example gradient clipping", whose naive implementation undoes the benefits of batch training with GPUs. By analyzing the back-propagation equations, we derive new methods for per-example gradient clipping that are compatible with auto-differentiation (e.g., in PyTorch and TensorFlow) and provide better GPU utilization. Our implementation in PyTorch…
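To make "per-example gradient clipping" concrete, here is a minimal sketch in plain Python. It is an illustration only, not the paper's implementation. It assumes a toy linear model with squared-error loss, and the function names (`naive_clipped_sum`, `reweighted_clipped_sum`) are hypothetical. The first function does what a naive implementation does: it materializes one gradient per example, rescales each to L2 norm at most C, and sums them. The second computes only each example's gradient norm, which for this toy model has a closed form, and uses it as a per-example weight. Both give the same clipped sum. The noise that differentially private training adds after clipping is omitted.

```python
import math


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vj * vj for vj in v))


def clip_factor(norm, clip_norm):
    """Scale factor min(1, C / ||g||); gradients within the bound are unchanged."""
    return 1.0 if norm <= clip_norm else clip_norm / norm


def per_example_grads(w, xs, ys):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w, one list per example."""
    grads = []
    for x, y in zip(xs, ys):
        residual = sum(wj * xj for wj, xj in zip(w, x)) - y
        grads.append([residual * xj for xj in x])
    return grads


def naive_clipped_sum(w, xs, ys, clip_norm):
    """Materialize every per-example gradient, clip each one, then sum."""
    total = [0.0] * len(w)
    for g in per_example_grads(w, xs, ys):
        scale = clip_factor(l2_norm(g), clip_norm)
        total = [t + scale * gj for t, gj in zip(total, g)]
    return total


def reweighted_clipped_sum(w, xs, ys, clip_norm):
    """Same result from norms alone (toy-model assumption).

    For this model ||g_i|| = |residual_i| * ||x_i||, so the clipped sum equals
    the gradient of a loss in which example i carries weight min(1, C / ||g_i||).
    """
    total = [0.0] * len(w)
    for x, y in zip(xs, ys):
        residual = sum(wj * xj for wj, xj in zip(w, x)) - y
        weight = clip_factor(abs(residual) * l2_norm(x), clip_norm)
        total = [t + weight * residual * xj for t, xj in zip(total, x)]
    return total


if __name__ == "__main__":
    w = [1.0, -1.0]
    xs = [[3.0, 4.0], [0.5, 0.0]]
    ys = [0.0, 0.0]
    print(naive_clipped_sum(w, xs, ys, clip_norm=1.0))
    print(reweighted_clipped_sum(w, xs, ys, clip_norm=1.0))
```

In the usage example, the first example's gradient has norm 5 and is scaled down to norm 1, while the second has norm 0.25 and passes through unchanged. The per-example loop in the naive version is the part that does not map well onto batched GPU computation.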
Taxonomy
Methods: Gradient Clipping
