A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting
Alexander Tyurin, Peter Richtárik

TL;DR
This paper introduces a distributed optimization method that combines variance reduction of stochastic gradients, partial participation, and communication compression. It achieves optimal oracle complexity and state-of-the-art communication complexity without requiring all nodes to participate or assuming bounded gradients (dissimilarity).
Contribution
The method is the first to integrate variance reduction with partial participation and compression, achieving optimal oracle complexity and state-of-the-art communication complexity in distributed nonconvex optimization.
Findings
Achieves optimal oracle complexity in the partial participation setting.
Achieves state-of-the-art communication complexity in the same setting.
Does not require bounded gradients or full node participation.
Abstract
We present a new method that includes three key components of distributed optimization and federated learning: variance reduction of stochastic gradients, partial participation, and compressed communication. We prove that the new method has optimal oracle complexity and state-of-the-art communication complexity in the partial participation setting. Regardless of the communication compression feature, our method successfully combines variance reduction and partial participation: we get the optimal oracle complexity, never need the participation of all nodes, and do not require the bounded gradients (dissimilarity) assumption.
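To make two of the three components named in the abstract concrete, here is a minimal sketch of one communication round with partial participation and compressed messages. It is not the authors' algorithm: the Rand-K sparsifier, uniform sampling without replacement, and the names `rand_k`, `sample_participants`, and `aggregate_round` are all assumptions chosen for illustration, and the variance-reduction component is deliberately left out.

```python
import random

def rand_k(vec, k, rng):
    """Rand-K sparsifier (assumed compressor): keep k random coordinates,
    scaled by d/k so the output is an unbiased estimate of vec."""
    d = len(vec)
    kept = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in kept else 0.0 for i, v in enumerate(vec)]

def sample_participants(n_nodes, s, rng):
    """Partial participation (assumed scheme): pick s of n nodes
    uniformly without replacement."""
    return rng.sample(range(n_nodes), s)

def aggregate_round(local_grads, s, k, rng):
    """Server-side average of compressed gradients from s sampled nodes."""
    chosen = sample_participants(len(local_grads), s, rng)
    d = len(local_grads[0])
    total = [0.0] * d
    for i in chosen:
        msg = rand_k(local_grads[i], k, rng)  # what node i would transmit
        for j in range(d):
            total[j] += msg[j]
    return [t / s for t in total]

# Hypothetical data: 5 nodes, each holding a 4-dimensional local gradient.
rng = random.Random(0)
local_grads = [[float(i + j) for j in range(4)] for i in range(5)]
estimate = aggregate_round(local_grads, s=2, k=2, rng=rng)
```

With `s` equal to the number of nodes and `k` equal to the dimension, the round reduces to the exact average of the local gradients. Smaller values trade accuracy of a single round for fewer participating nodes and fewer transmitted coordinates.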
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Cooperative Communication and Network Coding
