SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning
Avetik Karagulyan, Egor Shulgin, Abdurakhmon Sadiev, Peter Richtárik

TL;DR
The paper introduces SPAM, a federated learning algorithm for cross-device settings with non-convex losses that addresses client drift and exploits data similarity across clients, including under partial participation.
Contribution
It presents the first algorithm that does not require objective smoothness and leverages data similarity, with sharp analysis under Hessian similarity for non-convex federated learning.
Findings
Proves convergence benefits from data similarity.
Handles partial client participation effectively.
Applicable to non-convex loss functions without smoothness.
Abstract
Cross-device training is a crucial subfield of federated learning, where the number of clients can reach into the billions. Standard approaches and local methods are prone to issues such as client drift and insensitivity to data similarities. We propose a novel algorithm (SPAM) for cross-device federated learning with non-convex losses, which solves both issues. We provide sharp analysis under second-order (Hessian) similarity, a condition satisfied by a variety of machine learning problems in practice. Additionally, we extend our results to the partial participation setting, where a cohort of selected clients communicates with the server at each communication round. Our method is the first of its kind that does not require smoothness of the objective and provably benefits from clients having similar data.
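For readers unfamiliar with the terms used in the abstract, the block below writes out one common formalization of second-order (Hessian) similarity and the proximal operator on which proximal point methods are built. The symbols f_i, f, n, \delta, and \gamma are notation assumed here for illustration and do not come from this page; the paper's exact assumption and constants may differ.

```latex
% Assumed setup: n clients, client i holds local loss f_i.
f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% One common form of second-order (Hessian) similarity with constant \delta \ge 0:
\left\| \nabla^2 f_i(x) - \nabla^2 f(x) \right\| \le \delta
\quad \text{for all } x \text{ and all } i \in \{1, \dots, n\}

% Proximal operator of f_i with step size \gamma > 0:
\operatorname{prox}_{\gamma f_i}(x)
  = \arg\min_{y} \left\{ f_i(y) + \frac{1}{2\gamma} \| y - x \|^2 \right\}
```

Under this reading, a small \delta means the clients' local losses curve in nearly the same way, which is the sense in which a method can benefit from clients having similar data.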
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Random Matrices and Applications
