Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization

Jiaqi Zhang; Keyou You

arXiv:1909.02712·cs.LG·August 31, 2020·25 cites

Open Access

TL;DR

This paper introduces a decentralized stochastic gradient tracking algorithm for non-convex empirical risk minimization, demonstrating convergence properties, network independence, and linear speedup, with empirical validation on neural networks and logistic regression.

Contribution

The paper extends DSGT to non-convex problems and provides a convergence analysis, a network independence result, and empirical validation, none of which were previously available in this setting.

Findings

01. The convergence rate is characterized in terms of the network's algebraic connectivity, the mini-batch size, and the gradient variance.

02. Under certain conditions, DSGT is network independent: the topology affects the convergence rate only up to a constant factor.

03. Linear speedup with respect to the number of nodes is possible.

Abstract

This paper studies a decentralized stochastic gradient tracking (DSGT) algorithm for non-convex empirical risk minimization problems over a peer-to-peer network of nodes, in sharp contrast to existing DSGT results, which cover only convex problems. To ensure exact convergence and handle the variance among decentralized datasets, each node performs a stochastic gradient (SG) tracking step using a mini-batch of samples, where the batch size is designed to be proportional to the size of the local dataset. We explicitly evaluate the convergence rate of DSGT with respect to the number of iterations in terms of the algebraic connectivity of the network, mini-batch size, gradient variance, etc. Under certain conditions, we further show that DSGT has a network independence property in the sense that the network topology only affects the convergence rate up to a constant factor. Hence, the…
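The tracking step described in the abstract can be illustrated with a minimal NumPy sketch. It uses one common form of the gradient-tracking recursion, x ← W(x − γy) followed by y ← Wy + g_new − g_old, which may differ in details from the paper's exact update. Everything concrete here is an assumption made for illustration: the ring topology with uniform 1/3 weights, the synthetic least-squares loss (convex, chosen only because it is easy to check; a non-convex loss would replace `minibatch_grad`), the step size, the uniform averaging of local losses, and the hypothetical helper names `ring_weights`, `batch_sizes`, `minibatch_grad`, and `dsgt`. The only element taken directly from the abstract is that each node's batch size is proportional to its local dataset size.

```python
import numpy as np

def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring (assumed topology)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] += 1 / 3
        W[i, (i - 1) % n] += 1 / 3
        W[i, (i + 1) % n] += 1 / 3
    return W

def batch_sizes(data, ratio):
    """Batch size proportional to each node's local dataset size."""
    return [max(1, int(round(ratio * len(b)))) for _, b in data]

def minibatch_grad(x, A, b, batch, rng):
    """Stochastic gradient of 0.5 * mean((A x - b)^2) on a random mini-batch."""
    idx = rng.choice(len(b), size=batch, replace=False)
    residual = A[idx] @ x - b[idx]
    return A[idx].T @ residual / batch

def dsgt(data, W, step, iters, ratio, rng):
    """Gradient-tracking sketch: x holds local iterates, y tracks the average gradient."""
    n, dim = len(data), data[0][0].shape[1]
    batches = batch_sizes(data, ratio)

    def all_grads(x):
        return np.stack([
            minibatch_grad(x[i], data[i][0], data[i][1], batches[i], rng)
            for i in range(n)
        ])

    x = np.zeros((n, dim))
    g = all_grads(x)
    y = g.copy()                      # tracker starts at the first stochastic gradients
    for _ in range(iters):
        x = W @ (x - step * y)        # local descent along the tracker, then neighbor averaging
        g_new = all_grads(x)
        y = W @ y + g_new - g         # tracking step: mix trackers, add the gradient change
        g = g_new
    return x, y, g

# Synthetic data: five nodes with unequal local dataset sizes (assumed values).
rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0, 0.5])
data = []
for size in [40, 80, 120, 60, 100]:
    A = rng.normal(size=(size, 3))
    b = A @ x_true + 0.1 * rng.normal(size=size)
    data.append((A, b))

W = ring_weights(len(data))
x, y, g = dsgt(data, W, step=0.02, iters=500, ratio=0.1, rng=rng)
print("distance of average iterate from x_true:", np.linalg.norm(x.mean(axis=0) - x_true))
```

Because `W` is doubly stochastic, the average of the trackers `y` equals the average of the most recent stochastic gradients `g` at every iteration. That invariant is what lets each node follow an estimate of the network-wide gradient rather than only its own local one.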

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Distributed Control Multi-Agent Systems

Methods: Stochastic Gradient Descent · Logistic Regression