Understanding Unintended Memorization in Federated Learning
Om Thakkar, Swaroop Ramaswamy, Rajiv Mathews, Françoise Beaufays

TL;DR
This paper investigates how the components of federated learning affect unintended memorization in trained models. It finds that clustering the training data by user, training with Federated Averaging, and training with differential privacy each reduce memorization of sensitive data.
Contribution
The paper initiates a formal study of how the components of canonical federated learning affect unintended memorization, compared with the central learning setting. It highlights the roles of data clustering, Federated Averaging, and differential privacy.
Findings
Clustering the training data by user reduces memorization.
Training with Federated Averaging decreases memorization further.
Training with differential privacy yields the models with the least memorization.
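To make the three components above concrete, here is a minimal sketch of one training round, assuming a toy linear model with squared loss. It is an illustration only, not the authors' implementation: the function names (`local_update`, `clip`, `fedavg_round`), the learning rate, and the clipping and noise parameters are all hypothetical, and no privacy accounting is performed. Data is grouped by user, each sampled user trains locally, and the server averages the resulting model deltas, optionally clipping each delta and adding Gaussian noise.

```python
import math
import random


def local_update(weights, user_examples, lr=0.1, epochs=1):
    """Run local SGD on one user's examples and return the model delta."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in user_examples:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            # Gradient of 0.5 * err**2 with respect to w is err * x.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return [wi - w0 for wi, w0 in zip(w, weights)]


def clip(delta, max_norm):
    """Scale delta so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= max_norm or norm == 0.0:
        return list(delta)
    return [d * max_norm / norm for d in delta]


def fedavg_round(weights, data_by_user, users_per_round,
                 clip_norm=None, noise_std=0.0, rng=random):
    """Sample users, average their (optionally clipped) deltas, add optional noise."""
    sampled = rng.sample(sorted(data_by_user), users_per_round)
    total = [0.0] * len(weights)
    for user in sampled:
        delta = local_update(weights, data_by_user[user])
        if clip_norm is not None:
            # Bounding each user's contribution is the hypothetical DP-style step.
            delta = clip(delta, clip_norm)
        total = [t + d for t, d in zip(total, delta)]
    avg = [t / len(sampled) for t in total]
    if noise_std > 0.0:
        avg = [a + rng.gauss(0.0, noise_std) for a in avg]
    return [w + a for w, a in zip(weights, avg)]
```

A call such as `fedavg_round([0.0, 0.0], data_by_user, users_per_round=2, clip_norm=1.0, noise_std=0.01, rng=random.Random(0))` runs one round, where `data_by_user` maps a user id to a list of `(features, label)` pairs. Setting `clip_norm=None` and `noise_std=0.0` gives plain averaging of user deltas, which is one way to compare the two regimes in a toy setting.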
Abstract
Recent works have shown that generative sequence models (e.g., language models) have a tendency to memorize rare or unique sequences in the training data. Since useful models are often trained on sensitive data, to ensure the privacy of the training data it is critical to identify and mitigate such unintended memorization. Federated Learning (FL) has emerged as a novel framework for large-scale distributed learning tasks. However, it differs in many aspects from the well-studied central learning setting where all the data is stored at the central server. In this paper, we initiate a formal study to understand the effect of different components of canonical FL on unintended memorization in trained models, comparing with the central learning setting. Our results show that several differing components of FL play an important role in reducing unintended memorization. Specifically, we…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques
