Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning

Sai Praneeth Karimireddy; Martin Jaggi; Satyen Kale; Mehryar Mohri; Sashank J. Reddi; Sebastian U. Stich; Ananda Theertha Suresh

arXiv:2008.03606·cs.LG·June 9, 2021·94 cites

TL;DR

Mime is a federated learning framework that adapts centralized optimization algorithms, such as momentum and Adam, to the cross-device federated setting while mitigating client drift. The paper backs it with convergence guarantees and experiments, and reports that Mime combined with momentum-based variance reduction converges faster than centralized methods.

Contribution

Mime is a general framework that makes each local client update mimic the corresponding centralized algorithm. The paper provides convergence guarantees and shows that, combined with momentum-based variance reduction, Mime converges faster than centralized methods.

Findings

01. Mime mitigates client drift in federated learning.

02. When combined with momentum-based variance reduction, Mime converges faster than centralized methods.

03. Experimental results on real-world datasets support Mime's performance claims.

Abstract

Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients which gives rise to the client drift phenomenon. In fact, obtaining an algorithm for FL which is uniformly better than simple centralized training has been a major open problem thus far. In this work, we propose a general algorithmic framework, Mime, which i) mitigates client drift and ii) adapts arbitrary centralized optimization algorithms such as momentum and Adam to the cross-device federated learning setting. Mime uses a combination of control-variates and server-level statistics (e.g. momentum) at every client-update step to ensure that each local update mimics that of the centralized method run on iid data. We prove a reduction result showing that Mime can translate the convergence of a generic algorithm in the centralized setting into convergence in the…
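The client-update step described in the abstract can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the quadratic client losses, the full-batch gradients, full client participation, the SGD-momentum update rule, and the names `client_grad`, `mime_round`, and `run_demo` are all hypothetical choices made for the example. Each local step corrects the client gradient with a control variate and applies server momentum that stays fixed during the local steps.

```python
import numpy as np


def client_grad(w, a, b):
    """Gradient of the assumed toy client loss f_i(w) = 0.5 * a * ||w - b||^2."""
    return a * (w - b)


def mime_round(x, m, clients, lr=0.1, beta=0.9, local_steps=5):
    """One communication round of a Mime-style update (illustrative sketch)."""
    # Control variate: average client gradient evaluated at the server model x.
    c = np.mean([client_grad(x, a, b) for a, b in clients], axis=0)

    local_models = []
    for a, b in clients:
        y = x.copy()
        for _ in range(local_steps):
            # Correct the local gradient toward the global one (SVRG-style).
            g = client_grad(y, a, b) - client_grad(x, a, b) + c
            # Momentum step using the server statistic m, which is NOT
            # updated during local steps (assumed update rule).
            y = y - lr * ((1.0 - beta) * g + beta * m)
        local_models.append(y)

    # The server averages the local models and only then refreshes momentum.
    x_new = np.mean(local_models, axis=0)
    m_new = (1.0 - beta) * c + beta * m
    return x_new, m_new


def run_demo(rounds=300):
    """Run the sketch on three heterogeneous clients (hypothetical data)."""
    clients = [
        (1.0, np.array([4.0, 0.0])),
        (2.0, np.array([-2.0, 3.0])),
        (0.5, np.array([0.0, -6.0])),
    ]
    # Minimizer of the average loss: curvature-weighted mean of the targets.
    w_star = sum(a * b for a, b in clients) / sum(a for a, _ in clients)

    x, m = np.zeros(2), np.zeros(2)
    for _ in range(rounds):
        x, m = mime_round(x, m, clients)
    return x, w_star


if __name__ == "__main__":
    x, w_star = run_demo()
    print("final model:", x, "global optimum:", w_star)
```

In this toy setting, when all clients share the same curvature `a`, the corrected gradient `g` equals the gradient of the average loss at `y`, so every local step coincides with a centralized step. With differing curvatures the correction is only approximate, which is the regime the control variate is meant to handle.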

Code & Models

Repositories

google-research/public-data-in-dpfl (TensorFlow)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Cryptography and Data Security

Methods: Adam · Stochastic Gradient Descent