A Reduction from Delayed to Immediate Feedback for Online Convex Optimization with Improved Guarantees

Alexander Ryabchenko; Idan Attias; Daniel M. Roy

arXiv:2602.02634·cs.LG·February 4, 2026

TL;DR

This paper introduces a unified reduction framework for online convex optimization with delayed feedback. It improves the delay-dependent regret bounds for bandit convex optimization and recovers state-of-the-art bounds for first-order feedback through a simpler analysis.

Contribution

It presents a delay-adaptive reduction converting any online linear optimization algorithm into one that manages round-dependent delays with improved theoretical guarantees.

Findings

01

Achieves $O(\sqrt{d_{\text{tot}}} + T^{3/4}\sqrt{k})$ regret for bandit convex optimization.

02

Improves the delay-dependent regret term from the previous $O(\min\{\sqrt{T d_{\max}}, (T d_{\text{tot}})^{1/3}\})$ to $O(\sqrt{d_{\text{tot}}})$.

03

Provides a simpler, unified analysis recovering state-of-the-art bounds for first-order feedback.
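To make the second finding concrete, the sketch below evaluates the previous and improved delay-dependent terms on two hypothetical delay sequences. It assumes that $d_{\text{tot}}$ is the sum of the per-round delays and $d_{\max}$ is the largest one, and it ignores all constants and logarithmic factors hidden by the $O(\cdot)$ notation. The function name `delay_terms` and both delay sequences are illustrative choices, not taken from the paper.

```python
import math


def delay_terms(delays, T):
    """Return (previous, improved) delay-dependent terms, constants dropped.

    Assumption: d_tot is the sum of per-round delays and d_max their maximum.
    """
    d_tot = sum(delays)
    d_max = max(delays)
    # Previous bound: min{ sqrt(T * d_max), (T * d_tot)^(1/3) }
    previous = min(math.sqrt(T * d_max), (T * d_tot) ** (1 / 3))
    # Improved bound: sqrt(d_tot)
    improved = math.sqrt(d_tot)
    return previous, improved


T = 10_000

# Hypothetical sequence 1: every round is delayed by 10 steps.
constant_delays = [10] * T

# Hypothetical sequence 2: only the first 100 rounds are delayed, by 100 steps each.
bursty_delays = [100] * 100 + [0] * (T - 100)

for name, delays in [("constant", constant_delays), ("bursty", bursty_delays)]:
    previous, improved = delay_terms(delays, T)
    print(f"{name}: previous ~ {previous:.1f}, improved ~ {improved:.1f}")
```

With constant delays, $d_{\text{tot}} = T d_{\max}$, so the two terms coincide. The gap opens when delays are uneven: since $d_{\text{tot}} \le T d_{\max}$ always holds, $\sqrt{d_{\text{tot}}}$ is never larger than $\sqrt{T d_{\max}}$, and it is also at most $(T d_{\text{tot}})^{1/3}$ whenever $d_{\text{tot}} \le T^2$.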

Abstract

We develop a reduction-based framework for online learning with delayed feedback that recovers and improves upon existing results for both first-order and bandit convex optimization. Our approach introduces a continuous-time model under which regret decomposes into a delay-independent learning term and a delay-induced drift term, yielding a delay-adaptive reduction that converts any algorithm for online linear optimization into one that handles round-dependent delays. For bandit convex optimization, we significantly improve existing regret bounds, with delay-dependent terms matching state-of-the-art first-order rates. For first-order feedback, we recover state-of-the-art regret bounds via a simpler, unified analysis. Quantitatively, for bandit convex optimization we obtain $O(\sqrt{d_{\text{tot}}} + T^{3/4}\sqrt{k})$ regret, improving the delay-dependent term from…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Stochastic Gradient Optimization Techniques