Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search
Yannis Cattan, Christopher A. Choquette-Choo, Nicolas Papernot, Abhradeep Thakurta

TL;DR
This paper demonstrates that careful layer selection and additional hyperparameter tuning in differentially private fine-tuning significantly improve privacy-utility tradeoffs, achieving state-of-the-art results on CIFAR-100.
Contribution
The paper shows that DP fine-tuning needs its own hyperparameter search, including the choice of which layers to fine-tune, and that this search leads to improved privacy-utility tradeoffs.
Findings
Achieved 77.9% accuracy on CIFAR-100 with DP (ε=2, δ=10^{-5})
Careful layer selection enhances privacy-utility balance
Additional hyperparameter tuning is crucial for optimal DP fine-tuning
Abstract
Models need to be trained with privacy-preserving learning algorithms to prevent leakage of possibly sensitive information contained in their training data. However, canonical algorithms like differentially private stochastic gradient descent (DP-SGD) do not benefit from model scale in the same way as non-private learning. This manifests itself in the form of unappealing tradeoffs between privacy and utility (accuracy) when using DP-SGD on complex tasks. To remediate this tension, a paradigm is emerging: fine-tuning with differential privacy from a model pretrained on public (i.e., non-sensitive) training data. In this work, we identify an oversight of existing approaches for differentially private fine-tuning: they do not tailor the fine-tuning approach to the specifics of learning with privacy. Our main result is to show how carefully selecting the layers being fine-tuned in the…
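To make the idea of selecting which layers to fine-tune under DP-SGD concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function name `dp_sgd_step`, the dict-of-lists parameter layout, and the layer names `backbone` and `head` are all hypothetical, chosen only for illustration. The sketch assumes the standard DP-SGD recipe (clip each per-example gradient to an L2 norm bound, sum, add Gaussian noise scaled to that bound, then average) and applies it only to the layers listed in `trainable`. It does no privacy accounting, so it does not by itself yield an (ε, δ) guarantee.

```python
import math
import random


def dp_sgd_step(params, per_example_grads, trainable, clip_norm,
                noise_multiplier, lr, rng):
    """One DP-SGD update restricted to the layers named in `trainable`.

    params: dict mapping layer name -> list of floats.
    per_example_grads: list with one dict per example, same layout as params.
    All names here are illustrative assumptions, not from the paper.
    """
    batch_size = len(per_example_grads)
    # Accumulate the sum of clipped gradients for the selected layers only.
    summed = {name: [0.0] * len(params[name]) for name in trainable}

    for grads in per_example_grads:
        # The L2 norm is taken over the selected layers only: frozen layers
        # are never updated, so they are neither clipped nor noised.
        sq = sum(g * g for name in trainable for g in grads[name])
        norm = math.sqrt(sq)
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for name in trainable:
            for i, g in enumerate(grads[name]):
                summed[name][i] += g * scale

    # Copy every layer, then update only the selected ones.
    new_params = {name: list(values) for name, values in params.items()}
    sigma = noise_multiplier * clip_norm  # noise std scales with the clip bound
    for name in trainable:
        for i, s in enumerate(summed[name]):
            noisy = s + rng.gauss(0.0, sigma)
            new_params[name][i] -= lr * noisy / batch_size
    return new_params


if __name__ == "__main__":
    params = {"backbone": [1.0, 2.0], "head": [0.5]}
    grads = [
        {"backbone": [3.0, 4.0], "head": [3.0]},
        {"backbone": [0.1, 0.1], "head": [-0.5]},
    ]
    # Fine-tune only the (hypothetical) "head" layer; "backbone" stays frozen.
    updated = dp_sgd_step(params, grads, trainable=["head"], clip_norm=1.0,
                          noise_multiplier=1.0, lr=0.1, rng=random.Random(0))
    print(updated)
```

Restricting the clipped and noised parameter set is what makes the layer choice a hyperparameter in its own right: with fewer trainable coordinates, the same noise scale perturbs fewer dimensions of the update, while the frozen layers keep their pretrained values. In a real setup the clip norm, noise multiplier, learning rate, and layer set would be searched jointly, and ε would be computed by a privacy accountant.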
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques
