Learning Under Delayed Feedback: Implicitly Adapting to Gradient Delays

Rotem Zamir Aviv (1), Ido Hakimi (2), Assaf Schuster (2), Kfir Y. Levy (1, 3) ((1) Department of Electrical and Computer Engineering, Technion, (2) Department of Computer Science, Technion, (3) A Viterbi Fellow)

arXiv:2106.12261 · cs.LG · June 24, 2021

TL;DR

This paper introduces a robust stochastic convex optimization method that adapts to asynchronous gradient delays without prior knowledge of the delays, objective smoothness, or gradient variance, making it suitable for dynamic shared-resource environments.

Contribution

The proposed method is the first to implicitly adapt to changing gradient delays in asynchronous optimization without requiring prior delay or smoothness information.

Findings

1. Provides non-asymptotic convergence guarantees.
2. Handles dynamic and unknown gradient delays effectively.
3. Suitable for cloud and data center environments.

Abstract

We consider stochastic convex optimization problems, where several machines act asynchronously in parallel while sharing a common memory. We propose a robust training method for the constrained setting and derive non-asymptotic convergence guarantees that do not depend on prior knowledge of update delays, objective smoothness, or gradient variance. In contrast, existing methods for this setting crucially rely on this prior knowledge, which renders them unsuitable for essentially all shared-resource computational environments, such as clouds and data centers. Concretely, existing approaches are unable to accommodate changes in the delays that result from dynamic allocation of the machines, while our method implicitly adapts to such changes.
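
To make the setting in the abstract concrete, the following minimal sketch simulates projected SGD in which each update uses a gradient computed at a stale iterate, with a delay that changes from step to step. It is not the paper's method: the objective f(x) = 0.5 (x - 3)^2, the constraint interval [-1, 1], the fixed step size `eta`, the uniform delay distribution, and all function names are assumptions chosen purely for illustration. The hand-picked step size in particular is a tuning choice made for this toy problem, not something the paper prescribes.

```python
import random


def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi] (the constraint set)."""
    return max(lo, min(hi, x))


def delayed_projected_sgd(steps=2000, max_delay=5, eta=0.05, seed=0):
    """Projected SGD on f(x) = 0.5 * (x - 3)^2 using stale, noisy gradients.

    Illustrative only: the delay is drawn uniformly at each step to mimic
    machines whose latency changes over time.
    """
    rng = random.Random(seed)
    history = [0.0]  # history[t] holds the iterate x_t, starting from x_0 = 0
    for t in range(steps):
        # The delay cannot exceed the number of iterates produced so far.
        delay = rng.randint(0, min(max_delay, t))
        stale_x = history[t - delay]
        # Noisy gradient evaluated at the stale iterate, not the current one.
        grad = (stale_x - 3.0) + rng.gauss(0.0, 1.0)
        # Gradient step from the current iterate, then project back.
        history.append(project(history[t] - eta * grad))
    return sum(history) / len(history)  # averaged iterate


if __name__ == "__main__":
    print(delayed_projected_sgd())
```

In this toy problem the constrained minimizer is x = 1, the point of the interval closest to the unconstrained minimizer x = 3, so the averaged iterate should settle near 1 despite the stale gradients.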

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Age of Information Optimization