Dual-Free Stochastic Decentralized Optimization with Variance Reduction
Hadrien Hendrikx, Francis Bach, Laurent Massoulié

TL;DR
This paper introduces DVR, a decentralized stochastic optimization algorithm with variance reduction that achieves near-linear speedup using only local stochastic gradients, suitable for distributed machine learning.
Contribution
The paper presents DVR, a novel decentralized variance-reduced stochastic algorithm that is computationally efficient and does not require expensive oracles, improving scalability in distributed settings.
Findings
DVR achieves a near-linear speedup in the number of nodes.
The accelerated version of DVR improves convergence speed.
Simulations demonstrate DVR's effectiveness on real data.
Abstract
We consider the problem of training machine learning models on distributed data in a decentralized way. For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with variance reduction. Yet, existing decentralized stochastic algorithms either do not obtain the full speedup allowed by stochastic updates, or require oracles that are more expensive than regular gradients. In this work, we introduce a Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only requires computing stochastic gradients of the local functions, and is computationally as fast as a standard stochastic variance-reduced algorithm run on a 1/n fraction of the dataset, where n is the number of nodes. To derive DVR, we use Bregman coordinate descent on a well-chosen dual problem, and obtain a dual-free algorithm using a specific Bregman…
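To make the "stochastic updates combined with variance reduction" idea concrete, here is a minimal single-machine sketch of an SVRG-style update on a least-squares finite sum. It is not DVR itself: it has no decentralized communication and no Bregman or dual step. The function name, the objective, and all parameters are assumptions chosen for illustration only.

```python
import numpy as np


def svrg_least_squares(A, b, step, epochs, inner_steps, seed=0):
    """SVRG-style sketch for f(x) = 1/(2m) * sum_i (a_i @ x - b_i)**2.

    Illustrative only: a generic single-machine variance-reduced method,
    not the DVR algorithm described in the abstract.
    """
    rng = np.random.default_rng(seed)
    m, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        # Snapshot point and its full gradient, recomputed once per epoch.
        snapshot = x.copy()
        full_grad = A.T @ (A @ snapshot - b) / m
        for _ in range(inner_steps):
            i = rng.integers(m)  # sample one data point uniformly
            a_i = A[i]
            grad_x = a_i * (a_i @ x - b[i])            # stochastic gradient at x
            grad_snap = a_i * (a_i @ snapshot - b[i])  # same sample, at snapshot
            # Variance-reduced direction: unbiased for the full gradient,
            # with variance that shrinks as x and snapshot approach the optimum.
            x = x - step * (grad_x - grad_snap + full_grad)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 5))
    b = A @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200)
    # Conservative step size based on the largest per-sample smoothness constant.
    step = 0.1 / np.max(np.sum(A**2, axis=1))
    x_hat = svrg_least_squares(A, b, step, epochs=50, inner_steps=1000)
    x_star = np.linalg.lstsq(A, b, rcond=None)[0]
    print("distance to least-squares solution:", np.linalg.norm(x_hat - x_star))
```

Each inner step costs two single-sample gradients, while the full gradient is computed only once per epoch. In the decentralized setting of the abstract, each of the n nodes would apply updates of this kind to its own 1/n share of the data.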
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Distributed Control Multi-Agent Systems
