On Optimal Hyperparameters for Differentially Private Deep Transfer Learning

Aki Rehn; Linzh Zhao; Mikko A. Heikkilä; Antti Honkela

arXiv:2510.20616 · cs.LG · April 20, 2026

TL;DR

This paper investigates optimal hyperparameter choices for differentially private transfer learning, revealing mismatches between theory and practice and analyzing how clipping bounds and batch sizes affect privacy and performance.

Contribution

It uncovers the mismatch between theoretical guidelines and empirical results for hyperparameter tuning in DP transfer learning and offers insights into better selection strategies.

Findings

01

Larger clipping bounds perform better under strong privacy, contrary to theoretical expectations.

02

Existing heuristics for batch size tuning are ineffective under fixed compute budgets.

03

Using a single hyperparameter setting across tasks can lead to suboptimal performance.

Abstract

Differentially private (DP) transfer learning, i.e., fine-tuning a pretrained model on private data, is the current state-of-the-art approach for training large models under privacy constraints. We focus on two key hyperparameters in this setting: the clipping bound $C$ and batch size $B$. We show a clear mismatch between the current theoretical understanding of how to choose an optimal $C$ (stronger privacy requires smaller $C$) and empirical outcomes (larger $C$ performs better under strong privacy), caused by changes in the gradient distributions. Assuming a limited compute budget (fixed epochs), we demonstrate that the existing heuristics for tuning $B$ do not work, while cumulative DP noise better explains whether smaller or larger batches perform better. We also highlight how the common practice of using a single $(C, B)$ setting across tasks can lead to suboptimal performance. We…
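
To make the roles of $C$ and $B$ concrete, the following is a minimal sketch of one gradient computation in the standard DP-SGD recipe (per-example clipping followed by Gaussian noise). It is not the authors' code. The function names, the `noise_multiplier` parameter and the plain-list gradient representation are illustrative assumptions, and the sketch ignores privacy accounting, which in practice determines the noise multiplier for a given privacy budget. The last helper shows how, under a fixed number of epochs, the batch size determines the number of noisy update steps, assuming fixed-size batches rather than Poisson subsampling.

```python
import math
import random


def clip_to_norm(grad, clip_bound):
    """Rescale a per-example gradient so its L2 norm is at most clip_bound (C)."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_bound / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(per_example_grads, clip_bound, noise_multiplier, rng):
    """Noisy average gradient for one batch.

    Assumption: noise_multiplier is supplied by a privacy accountant
    that is outside the scope of this sketch.
    """
    batch_size = len(per_example_grads)  # B
    dim = len(per_example_grads[0])
    clipped = [clip_to_norm(g, clip_bound) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Gaussian noise with standard deviation proportional to C is added to the sum.
    noise_std = noise_multiplier * clip_bound
    noisy = [s + rng.gauss(0.0, noise_std) for s in summed]
    return [v / batch_size for v in noisy]


def steps_for_fixed_epochs(num_examples, batch_size, epochs):
    """Number of update steps when the compute budget is a fixed epoch count.

    Assumption: fixed-size batches, with a final partial batch per epoch.
    """
    return math.ceil(num_examples / batch_size) * epochs


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]
    print(dp_sgd_gradient(grads, clip_bound=1.0, noise_multiplier=1.0, rng=rng))
    for batch_size in (100, 500):
        print(batch_size, steps_for_fixed_epochs(1000, batch_size, epochs=10))
```

In this sketch the noise added to the averaged gradient has standard deviation `noise_multiplier * clip_bound / batch_size` per step, while a larger batch under a fixed epoch budget means fewer steps. The abstract's point about cumulative DP noise concerns how these effects combine over a whole training run, which this single-step sketch does not attempt to model.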

Videos

On Optimal Hyperparameters for Differentially Private Deep Transfer Learning · SlidesLive