Federated Majorize-Minimization: Beyond Parameter Aggregation
Aymeric Dieuleveut, Gersende Fort, Mahmoud Hegazy, Hoi-To Wai

TL;DR
This paper introduces a unified stochastic optimization framework for federated learning that learns surrogate functions locally and aggregates them, improving robustness to data heterogeneity and communication constraints.
Contribution
It develops a novel federated algorithm based on Majorize-Minimization that learns surrogate functions locally, extending previous stochastic MM methods to federated settings.
Findings
Framework unifies (proximal) gradient-based methods, Expectation Maximization, and variational surrogate MM as stochastic MM procedures.
Proposed QSMM handles data heterogeneity and communication constraints.
Demonstrated flexibility by applying to federated optimal transport maps.
Abstract
This paper proposes a unified approach for designing stochastic optimization algorithms that robustly scale to the federated learning setting. Our work studies a class of Majorize-Minimization (MM) problems, which possesses a linearly parameterized family of majorizing surrogate functions. This framework encompasses (proximal) gradient-based algorithms for (regularized) smooth objectives, the Expectation Maximization algorithm, and many problems seen as variational surrogate MM. We show that our framework motivates a unifying algorithm called Stochastic Approximation Stochastic Surrogate MM (SSMM), which includes previous stochastic MM procedures as special instances. We then extend SSMM to the federated setting, while taking into consideration common bottlenecks such as data heterogeneity, partial participation, and communication constraints; this yields QSMM. The originality of…
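To make the idea of aggregating surrogates rather than model parameters concrete, here is a minimal sketch for the simplest case the abstract mentions, a smooth objective with a quadratic majorizer. It is an illustration under stated assumptions, not the paper's SSMM or QSMM: it uses least-squares clients, exact local gradients, full participation, and no compression. The function names (`make_clients`, `local_surrogate_param`, `federated_surrogate_mm`), the synthetic data, and the step size are all hypothetical. For an L-smooth loss, the majorizer at the current iterate is, up to a constant, (L/2)·||theta - s||² with s = theta - grad/L, so the surrogate is fully described by the vector s and its minimizer is s itself.

```python
import numpy as np


def make_clients(n_clients=4, n_per_client=50, dim=3, seed=0):
    """Synthetic least-squares clients (hypothetical data, for illustration only)."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=dim)
    clients = []
    for i in range(n_clients):
        # Shift the feature mean per client to mimic data heterogeneity.
        X = rng.normal(loc=0.5 * i, size=(n_per_client, dim))
        y = X @ theta_true + 0.01 * rng.normal(size=n_per_client)
        clients.append((X, y))
    return clients, theta_true


def local_surrogate_param(X, y, theta, L):
    """Parameter s of the local quadratic majorizer (L/2)*||theta' - s||^2 + const."""
    grad = X.T @ (X @ theta - y) / len(y)
    return theta - grad / L


def federated_surrogate_mm(clients, n_rounds=2000, step=0.5):
    """Server loop: average the clients' surrogate parameters, then minimize."""
    dim = clients[0][0].shape[1]
    # Upper bound on the smoothness constant of the averaged loss:
    # the largest eigenvalue among the clients' local Hessians.
    L = max(np.linalg.eigvalsh(X.T @ X / len(y)).max() for X, y in clients)
    s = np.zeros(dim)  # running surrogate parameter kept by the server
    for _ in range(n_rounds):
        theta = s  # minimizer of the current quadratic surrogate
        s_hat = np.mean(
            [local_surrogate_param(X, y, theta, L) for X, y in clients], axis=0
        )
        # Stochastic-approximation style update in the surrogate space.
        s = s + step * (s_hat - s)
    return s
```

Calling `federated_surrogate_mm(make_clients()[0])` returns the final surrogate parameter, which in this quadratic case is also the model estimate. Here the surrogate update reduces to gradient descent with step size `step / L`; the framework described in the abstract covers surrogates (for example, those of Expectation Maximization) where the surrogate parameter and the model parameter are different objects.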
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Complexity and Algorithms in Graphs
