Sample Optimality and All-for-all Strategies in Personalized Federated and Collaborative Learning
Mathieu Even, Laurent Massoulié, Kevin Scaman

TL;DR
This paper establishes fundamental limits and proposes optimal strategies for personalized federated learning, enabling agents to efficiently minimize local loss functions through gradient filtering based on data distribution similarities.
Contribution
It introduces information-theoretic lower bounds and matching strategies for sample efficiency in personalized federated learning, utilizing gradient filtering techniques.
Findings
Derived lower bounds on sample complexity for personalized federated learning.
Proposed gradient filtering strategies that achieve these bounds.
Covered both the all-for-one and all-for-all settings, where respectively one agent or every agent minimizes its own local function.
Abstract
In personalized Federated Learning, each member of a potentially large set of agents aims to train a model minimizing its loss function averaged over its local data distribution. We study this problem under the lens of stochastic optimization. Specifically, we introduce information-theoretic lower bounds on the number of samples required from all agents to approximately minimize the generalization error of a fixed agent. We then provide strategies matching these lower bounds, in the all-for-one and all-for-all settings where respectively one or all agents desire to minimize their own local function. Our strategies are based on a gradient filtering approach: provided prior knowledge on some notions of distances or discrepancies between local data distributions or functions, a given agent filters and aggregates stochastic gradients received from other agents, in order to achieve an…
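The gradient filtering idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the hard-threshold filter, the uniform averaging, the toy quadratic losses, and the names `filtered_gradient`, `centers`, `discrepancy`, and `threshold` are all hypothetical. The sketch only shows the general shape of the approach, in which a target agent uses prior knowledge of pairwise discrepancies to decide whose stochastic gradients to aggregate.

```python
import numpy as np

def filtered_gradient(target, gradients, discrepancy, threshold):
    """Average the stochastic gradients of agents deemed close to `target`.

    Hypothetical filtering rule (an assumption, not the paper's exact scheme):
    keep agent j when discrepancy[target, j] <= threshold.
    """
    keep = discrepancy[target] <= threshold  # boolean mask over agents
    return gradients[keep].mean(axis=0), keep

# Toy setup (assumed): agent i has loss f_i(x) = 0.5 * ||x - c_i||^2.
# Agents 0, 1, 2 are similar; agent 3 is an outlier.
centers = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])

# Prior knowledge assumed here: pairwise distances between the minimizers.
discrepancy = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)

rng = np.random.default_rng(0)
x = np.array([3.0, -2.0])  # model of the target agent (agent 0)
step = 0.1

for _ in range(500):
    # Each agent reports a noisy gradient of its own loss at the current x.
    noise = rng.normal(scale=0.5, size=centers.shape)
    grads = (x - centers) + noise
    g, keep = filtered_gradient(0, grads, discrepancy, threshold=0.5)
    x = x - step * g

print("agents kept:", keep)
print("final iterate:", x)
```

In this toy run, agent 0 pools gradients from the three nearby agents, which lowers the variance of each step at the cost of a small bias toward their average minimizer, while the distant agent is discarded. Choosing the threshold is exactly this bias-variance trade-off; the sketch makes no claim about how the paper sets it.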
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
