Anytime Minibatch with Delayed Gradients

Haider Al-Lawati; Stark C. Draper

arXiv:2012.08616·cs.DC·December 17, 2020

TL;DR

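This paper introduces AMB-DG, a distributed optimization method in which workers compute gradients over fixed-time epochs, so the per-epoch minibatch size varies, while the master updates the parameters with stale gradients. It achieves optimal convergence rates for convex smooth objectives and faster wall-clock convergence in distributed settings.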

Contribution

The paper proposes Anytime Minibatch with Delayed Gradients (AMB-DG), an asynchronous distributed optimization algorithm that combines delayed gradients with a variable per-epoch minibatch. It analyzes the algorithm's regret bound and convergence rate and compares it empirically against existing methods.

Findings

01. AMB-DG achieves optimal regret bounds for convex smooth objective functions.

02. In experiments, AMB-DG converges faster than AMB and fixed-minibatch methods.

03. AMB-DG reduces worker idle time and improves wall-clock convergence in distributed systems.

Abstract

Distributed optimization is widely deployed in practice to solve a broad range of problems. In a typical asynchronous scheme, workers calculate gradients with respect to out-of-date optimization parameters while the master uses stale (i.e., delayed) gradients to update the parameters. While using stale gradients can slow the convergence, asynchronous methods speed up the overall optimization with respect to wall clock time by allowing more frequent updates and reducing idling times. In this paper, we present a variable per-epoch minibatch scheme called Anytime Minibatch with Delayed Gradients (AMB-DG). In AMB-DG, workers compute gradients in epochs of a fixed time while the master uses stale gradients to update the optimization parameters. We analyze AMB-DG in terms of its regret bound and convergence rate. We prove that for convex smooth objective functions, AMB-DG achieves the optimal…
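The scheme described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's algorithm or update rule: it assumes a scalar least-squares problem, a plain gradient step with a constant step size, a random per-worker gradient count standing in for "whatever a worker finishes in a fixed-time epoch", and a fixed staleness of `delay` epochs. The function name `amb_dg_sketch` and all of its parameters are hypothetical.

```python
import random
from collections import deque


def amb_dg_sketch(num_epochs=200, num_workers=4, delay=2, lr=0.1, seed=0):
    """Toy illustration of fixed-time epochs with delayed gradients.

    Assumed problem (not from the paper): fit w in y = w_true * x + noise
    by minimizing the expected squared error.
    """
    rng = random.Random(seed)
    w_true = 3.0
    w = 0.0
    # The last (delay + 1) parameter values; history[0] is the stale copy
    # that workers use, i.e. the parameter from `delay` epochs ago.
    history = deque([w] * (delay + 1), maxlen=delay + 1)

    for _ in range(num_epochs):
        w_stale = history[0]
        grad_sum, count = 0.0, 0
        for _ in range(num_workers):
            # Fixed-time epoch: each worker completes a different number of
            # gradients, so the total minibatch size varies per epoch.
            finished = rng.randint(1, 10)
            for _ in range(finished):
                x = rng.gauss(0.0, 1.0)
                y = w_true * x + rng.gauss(0.0, 0.1)
                grad_sum += (w_stale * x - y) * x  # gradient at the stale point
                count += 1
        # Master step: average whatever arrived, then update the current w.
        w -= lr * grad_sum / count
        history.append(w)
    return w


if __name__ == "__main__":
    print(amb_dg_sketch())
```

The `deque` with `maxlen=delay + 1` keeps the staleness explicit: setting `delay=0` recovers a synchronous variable-minibatch update, while larger values make the master step on increasingly outdated gradients.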
