Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics

Alexander Tyurin

arXiv:2408.04929 · math.OC · February 7, 2025


TL;DR

This paper establishes tight lower bounds on the time complexities of distributed stochastic optimization algorithms under arbitrary and realistic computation dynamics, providing a universal framework that applies to many existing methods.

Contribution

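The paper introduces a universal computation model that captures real-world variability in worker computation speeds, and proves lower bounds on time complexity that are tight: up to constant factors, they are matched by the optimal Rennala SGD and Malenia SGD methods.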

Findings

01. Universal computation model for real-world distributed systems

02. Tight lower bounds applicable to synchronous and asynchronous methods

03. Optimal algorithms that match these bounds

Abstract

In distributed stochastic optimization, where parallel and asynchronous methods are employed, we establish optimal time complexities under virtually any computation behavior of workers/devices/CPUs/GPUs, capturing potential disconnections due to hardware and network delays, time-varying computation powers, and any possible fluctuations and trends of computation speeds. These real-world scenarios are formalized by our new universal computation model. Leveraging this model and new proof techniques, we discover tight lower bounds that apply to virtually all synchronous and asynchronous methods, including Minibatch SGD, Asynchronous SGD (Recht et al., 2011), and Picky SGD (Cohen et al., 2021). We show that these lower bounds, up to constant factors, are matched by the optimal Rennala SGD and Malenia SGD methods (Tyurin & Richtárik, 2023).
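To make the idea of time-varying computation powers concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formal model: time is discretized into ticks, each worker has a hypothetical power function giving gradients per tick (0 while disconnected), and a worker has finished as many gradients as the floor of its accumulated work. The function names `time_to_collect` and `time_for_all_workers`, the three example workers, and the two waiting rules are all invented for this example. The first rule waits until the workers jointly finish a batch, loosely in the spirit of asynchronous batch collection; the second waits until every worker finishes one gradient, as in one synchronous round.

```python
from fractions import Fraction
from math import floor


def accumulate(powers, horizon):
    """Yield (tick, work) where work[i] is worker i's accumulated work after that tick."""
    work = [Fraction(0)] * len(powers)
    for t in range(1, horizon + 1):
        for i, power in enumerate(powers):
            # power(t - 1) is the (assumed) speed of worker i during tick t.
            work[i] += Fraction(power(t - 1))
        yield t, work


def time_to_collect(powers, batch_size, horizon):
    """First tick at which the workers have jointly finished batch_size gradients."""
    for t, work in accumulate(powers, horizon):
        if sum(floor(w) for w in work) >= batch_size:
            return t
    return None  # not reached within the horizon


def time_for_all_workers(powers, horizon):
    """First tick at which every worker has finished at least one gradient."""
    for t, work in accumulate(powers, horizon):
        if all(floor(w) >= 1 for w in work):
            return t
    return None  # not reached within the horizon


# Hypothetical workers: one fast, one slow, one disconnected for the first 10 ticks.
powers = [
    lambda t: 1,
    lambda t: Fraction(1, 4),
    lambda t: 0 if t < 10 else 1,
]

print(time_to_collect(powers, batch_size=3, horizon=100))  # 3
print(time_for_all_workers(powers, horizon=100))           # 11
```

In this toy setting, collecting three gradients from whichever workers are available takes 3 ticks, while waiting for one gradient from every worker takes 11 ticks because of the disconnected worker. Exact fractions are used so that the floor is not affected by floating-point rounding.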

Videos

Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics · SlidesLive