Learning Rate Scheduling with Matrix Factorization for Private Training
Nikita P. Kalinin, Joel Daniel Andersson

TL;DR
This paper develops theoretical bounds and practical methods for differentially private training with learning rate schedules, showing that schedule-aware matrix factorizations improve accuracy.
Contribution
The paper introduces a theoretical framework for learning rate schedules in private training and proposes a schedule-aware factorization that improves on prefix-sum factorizations under both the MaxSE and MeanSE error metrics.
Findings
Schedule-aware factorizations improve accuracy in private training.
General upper and lower bounds are derived for a broad class of learning rate schedules, in both single- and multi-epoch settings.
Experiments on the CIFAR-10 and IMDB datasets confirm the effectiveness of the approach.
Abstract
We study differentially private model training with stochastic gradient descent under learning rate scheduling and correlated noise. Although correlated noise, in particular via matrix factorizations, has been shown to improve accuracy, prior theoretical work focused primarily on the prefix-sum workload. That workload assumes a constant learning rate, whereas in practice learning rate schedules are widely used to accelerate training and improve convergence. We close this gap by deriving general upper and lower bounds for a broad class of learning rate schedules in both single- and multi-epoch settings. Building on these results, we propose a learning-rate-aware factorization that achieves improvements over prefix-sum factorizations under both MaxSE and MeanSE error metrics. Our theoretical analysis yields memory-efficient constructions suitable for practical deployment, and experiments…
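To make the difference between the prefix-sum workload and a schedule-aware workload concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's construction. It assumes the workload matrix A is lower triangular with entry (t, s) equal to the learning rate at step s, so that A applied to the gradient sequence gives the accumulated updates. It also assumes the single-participation error definitions commonly used in the matrix-factorization literature: for A = BC, MaxSE is the largest row norm of B times the largest column norm of C, and MeanSE is the Frobenius norm of B divided by sqrt(n), times the largest column norm of C. All function names and the example schedule are hypothetical, and only the two trivial factorizations (B = A, C = I) and (B = I, C = A) are compared.

```python
import math


def schedule_workload(lrs):
    """Assumed workload: entry (t, s) is lrs[s] for s <= t, else 0."""
    n = len(lrs)
    return [[lrs[s] if s <= t else 0.0 for s in range(n)] for t in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def max_row_norm(B):
    return max(math.sqrt(sum(v * v for v in row)) for row in B)


def max_col_norm(C):
    cols = range(len(C[0]))
    return max(math.sqrt(sum(row[j] ** 2 for row in C)) for j in cols)


def frobenius_norm(B):
    return math.sqrt(sum(v * v for row in B for v in row))


def max_se(B, C):
    # Assumed definition: worst-case per-step error for A = B C.
    return max_row_norm(B) * max_col_norm(C)


def mean_se(B, C):
    # Assumed definition: root-mean-square error over steps for A = B C.
    return frobenius_norm(B) / math.sqrt(len(B)) * max_col_norm(C)


# Hypothetical schedule: learning rate halves at every step.
lrs = [1.0, 0.5, 0.25, 0.125]
n = len(lrs)

A_prefix = schedule_workload([1.0] * n)  # constant rate: prefix-sum workload
A_sched = schedule_workload(lrs)         # schedule-aware workload
I = identity(n)

# Applying the workload to scalar gradients gives the accumulated updates.
grads = [1.0, 2.0, 3.0, 4.0]
updates = matvec(A_sched, grads)

for name, A in [("prefix-sum", A_prefix), ("scheduled", A_sched)]:
    print(name,
          "B=A,C=I:", round(max_se(A, I), 4), round(mean_se(A, I), 4),
          "B=I,C=A:", round(max_se(I, A), 4), round(mean_se(I, A), 4))
```

Under these assumptions, the two trivial factorizations give the same MaxSE on the prefix-sum workload, but they diverge once the schedule decays. This gives some intuition for why a factorization tuned to the prefix-sum workload need not be a good choice for a scheduled one.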
