Dual-Free Stochastic Decentralized Optimization with Variance Reduction

Hadrien Hendrikx; Francis Bach; Laurent Massoulié

arXiv:2006.14384·math.OC·June 26, 2020·6 cites

TL;DR

This paper introduces DVR, a decentralized stochastic optimization algorithm with variance reduction for distributed machine learning. It uses only stochastic gradients of the local functions and achieves a speedup that is nearly linear in the number of nodes.

Contribution

The paper presents DVR, a decentralized variance-reduced stochastic algorithm that requires no oracle more expensive than a regular stochastic gradient of the local functions, which improves scalability in distributed settings.

Findings

01

DVR achieves a speedup that is nearly linear in the number of nodes.

02

The accelerated version of DVR improves convergence speed.

03

Simulations demonstrate DVR's effectiveness on real data.

Abstract

We consider the problem of training machine learning models on distributed data in a decentralized way. For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with variance reduction. Yet, existing decentralized stochastic algorithms either do not obtain the full speedup allowed by stochastic updates, or require oracles that are more expensive than regular gradients. In this work, we introduce a Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only requires computing stochastic gradients of the local functions, and is computationally as fast as a standard stochastic variance-reduced algorithm run on a $1/n$ fraction of the dataset, where $n$ is the number of nodes. To derive DVR, we use Bregman coordinate descent on a well-chosen dual problem, and obtain a dual-free algorithm using a specific Bregman…
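
To make the phrase "stochastic updates combined with variance reduction" concrete, here is a minimal single-machine sketch. It is not DVR and is not taken from the paper: it uses SVRG, one standard variance-reduced method, on a toy scalar least-squares problem. The function name, the data, and the step size are all assumptions chosen for illustration. The abstract compares the per-node computational cost of DVR to running an algorithm of this kind on a $1/n$ fraction of the dataset.

```python
import random


def svrg_least_squares(a, b, step, epochs, inner_steps, seed=0):
    """Illustrative SVRG on f(x) = (1/m) * sum_i 0.5 * (a[i] * x - b[i])**2.

    x is a scalar to keep the sketch short. This is a single-machine
    stand-in for "a standard stochastic variance-reduced algorithm",
    not the decentralized DVR algorithm itself.
    """
    rng = random.Random(seed)
    m = len(a)
    x = 0.0
    for _ in range(epochs):
        snapshot = x
        # Full gradient at the snapshot point, computed once per epoch.
        full_grad = sum(a[i] * (a[i] * snapshot - b[i]) for i in range(m)) / m
        for _ in range(inner_steps):
            i = rng.randrange(m)  # sample one data point uniformly
            grad_i = a[i] * (a[i] * x - b[i])
            grad_i_snapshot = a[i] * (a[i] * snapshot - b[i])
            # Unbiased gradient estimate whose variance shrinks as both
            # x and the snapshot approach the minimizer.
            x -= step * (grad_i - grad_i_snapshot + full_grad)
    return x


# Toy data (assumed): b is roughly 2 * a, so the minimizer is close to 2.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.1, 3.9, 6.2, 7.8]

# Closed-form minimizer of the scalar least-squares objective.
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
x_hat = svrg_least_squares(a, b, step=0.02, epochs=30, inner_steps=8)
print(abs(x_hat - x_star))
```

Each inner step touches a single data point, yet the correction term built from the snapshot lets the method keep a constant step size, which plain stochastic gradient descent cannot do without stalling at a noise floor.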

Code & Models

Repositories

HadrienHx/DVR_NeurIPS
tf · Official

Videos

Dual-Free Stochastic Decentralized Optimization with Variance Reduction · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Distributed Control Multi-Agent Systems