Recycling Scraps: Improving Private Learning by Leveraging Intermediate Checkpoints

Virat Shejwalkar; Arun Ganesh; Rajiv Mathews; Yarong Mu; Shuang Song; Om Thakkar; Abhradeep Thakurta; Xinyi Zheng

arXiv:2210.01864·cs.LG·September 18, 2024

Open Access

TL;DR

This paper introduces a framework that leverages intermediate checkpoints during training to enhance the accuracy of differentially private machine learning models without additional privacy costs.

Contribution

It proposes a novel checkpoint aggregation method that improves DP ML accuracy and variance estimation, operating within a single training run.

Findings

01. Significant accuracy improvements on StackOverflow, CIFAR10, and CIFAR100 datasets.

02. Enhanced utility and reduced variance in proprietary production tasks.

03. Effective variance estimation from last few checkpoints under standard assumptions.
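
The third finding can be illustrated with a minimal sketch. This summary does not spell out the paper's estimator or its assumptions, so the code below only shows the simplest version of the idea: take a scalar quantity evaluated at each checkpoint (for example, a model's prediction on a fixed input) and compute the sample variance over the last k checkpoints. The function name `tail_variance` and the example values are hypothetical.

```python
import statistics
from typing import Sequence


def tail_variance(metric_per_checkpoint: Sequence[float], k: int) -> float:
    """Sample variance of a scalar metric over the last k checkpoints.

    Assumption (not from the paper summary): the metric values at the
    final k checkpoints are treated as samples of the quantity whose
    variance we want to estimate.
    """
    if not 2 <= k <= len(metric_per_checkpoint):
        raise ValueError("k must be between 2 and the number of checkpoints")
    # statistics.variance computes the unbiased (n - 1) sample variance.
    return statistics.variance(metric_per_checkpoint[-k:])


# Hypothetical metric values, one per saved checkpoint.
metrics = [0.0, 10.0, 1.0, 2.0, 3.0]
estimate = tail_variance(metrics, k=3)
```

Because the estimate reuses checkpoints from a single training run, it needs no additional training runs, which matches the "single training run" framing in the Contribution section.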

Abstract

In this work, we focus on improving the accuracy-variance trade-off for state-of-the-art differentially private machine learning (DP ML) methods. First, we design a general framework that uses aggregates of intermediate checkpoints during training to increase the accuracy of DP ML techniques. Specifically, we demonstrate that training over aggregates can provide significant gains in prediction accuracy over the existing state-of-the-art for StackOverflow, CIFAR10 and CIFAR100 datasets. For instance, we improve the state-of-the-art DP StackOverflow accuracies to 22.74% (+2.06% relative) for $\epsilon = 8.2$, and 23.90% (+2.09%) for $\epsilon = 18.9$. Furthermore, these gains magnify in settings with periodically varying training data distributions. We also demonstrate that our methods achieve relative improvements of 0.54% and 62.6% in terms of utility and variance, on a…
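
To make "aggregates of intermediate checkpoints" concrete, here is a minimal sketch of two common ways to aggregate parameter checkpoints: a uniform average over the last k checkpoints and an exponential moving average. The excerpt above does not say which aggregation rules the authors use, so both functions, their names, and the flat-list checkpoint representation are assumptions made for illustration. If every checkpoint is already covered by the training run's DP guarantee, combining them is post-processing, which is consistent with the claim of no additional privacy cost.

```python
from typing import List, Sequence

# Hypothetical representation: each checkpoint is a flattened parameter vector.
Checkpoint = List[float]


def uniform_tail_average(checkpoints: Sequence[Checkpoint], k: int) -> Checkpoint:
    """Average the last k checkpoints coordinate by coordinate."""
    if not 1 <= k <= len(checkpoints):
        raise ValueError("k must be between 1 and the number of checkpoints")
    tail = checkpoints[-k:]
    dim = len(tail[0])
    return [sum(c[i] for c in tail) / k for i in range(dim)]


def exponential_moving_average(
    checkpoints: Sequence[Checkpoint], decay: float
) -> Checkpoint:
    """Fold checkpoints into a running EMA, starting from the first one."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    ema = list(checkpoints[0])
    for c in checkpoints[1:]:
        # Older information is weighted by `decay`, the new checkpoint by 1 - decay.
        ema = [decay * e + (1.0 - decay) * x for e, x in zip(ema, c)]
    return ema


# Hypothetical checkpoints saved at three points during one training run.
saved = [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]]
averaged = uniform_tail_average(saved, k=2)
smoothed = exponential_moving_average(saved, decay=0.5)
```

In this sketch the aggregation is applied to saved parameters after the fact; the abstract describes using aggregates during training, so the authors' framework may integrate this step differently.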

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security