Better Bounds for the Distributed Experts Problem
David P. Woodruff, Samson Zhou

TL;DR
This paper introduces a new protocol for the distributed experts problem that achieves lower regret bounds with efficient communication, improving upon previous methods in distributed online learning.
Contribution
The paper presents a novel protocol that reduces regret bounds and communication costs for the distributed experts problem, advancing the state-of-the-art in distributed online learning.
Findings
Achieves regret roughly $1/\sqrt{T}$, up to polylogarithmic factors.
Uses $O((n + s)/R^2)$ bits of communication, up to polylogarithmic factors, for $n$ experts, $s$ servers, and regret $R$.
Improves upon previous bounds for the distributed experts problem.
Abstract
In this paper, we study the distributed experts problem, where $n$ experts are distributed across $s$ servers for $T$ timesteps. The loss of each expert at each time $t$ is the $\ell_p$ norm of the vector that consists of the losses of the expert at each of the $s$ servers at time $t$. The goal is to minimize the regret $R$, i.e., the loss of the distributed protocol compared to the loss of the best expert, amortized over all $T$ times, while using the minimum amount of communication. We give a protocol that achieves regret roughly $R \gtrsim 1/\sqrt{T}$, up to polylogarithmic factors, using $O((n + s)/R^2)$ bits of communication, up to polylogarithmic factors, which improves on previous work.
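To make the objective concrete, here is a minimal centralized sketch of the two quantities the abstract defines: the per-expert loss (an $\ell_p$ norm over per-server losses) and the amortized regret against the best fixed expert. The function names, the `losses[t][i][j]` layout, and the toy numbers are assumptions made for illustration. The sketch ignores communication entirely and is not the paper's protocol.

```python
import math


def lp_loss(server_losses, p):
    """l_p norm of one expert's per-server losses at a single timestep."""
    if math.isinf(p):
        return max(abs(x) for x in server_losses)
    return sum(abs(x) ** p for x in server_losses) ** (1.0 / p)


def average_regret(losses, choices, p):
    """Amortized regret of a sequence of expert choices.

    losses[t][i][j]: loss of expert i at server j at time t (hypothetical layout).
    choices[t]: index of the expert played at time t.
    """
    T = len(losses)
    n = len(losses[0])
    # Aggregate each expert's per-server losses into one l_p loss per timestep.
    agg = [[lp_loss(losses[t][i], p) for i in range(n)] for t in range(T)]
    played = sum(agg[t][choices[t]] for t in range(T))
    best = min(sum(agg[t][i] for t in range(T)) for i in range(n))
    return (played - best) / T


# Toy instance: T = 2 timesteps, n = 2 experts, s = 2 servers.
toy_losses = [
    [[3.0, 4.0], [0.0, 1.0]],
    [[0.0, 0.0], [6.0, 8.0]],
]
toy_regret = average_regret(toy_losses, choices=[0, 1], p=2)
```

In the toy instance the played sequence always picks the worse expert, so the amortized regret is large. A distributed protocol has to keep this quantity small without shipping every per-server loss to a coordinator.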
Peer Reviews
Decision: ICLR 2026 Poster
Strength #1: The paper is extremely well written, communicating complicated ideas in an incremental fashion. The main ideas of the theory are presented in a manner that allows the proofs to be checked easily. Strength #2: The paper improves upon previous work by a) generalizing to arbitrary p-norms, b) attaining the regret-sensitive rate s/R^2 in the dependence on the number of servers, and c) giving an algorithm that improves the dependence on the number of servers by a factor max(s^{1-2/p}) Stre
Weakness #1: The trick used to trade off communication against regret is somewhat artificial, requiring that all servers are silent on some rounds. While the theory works out, I wonder if there is a more natural algorithm that would yield further improvement. Weakness #2: This is largely a theory paper exploring communication-regret trade-offs in expert learning. It may be a little outside the primary areas of interest for ICLR.
- This is the first work to analyze distributed experts in the coordinator model for general $\ell_p$ losses.
- The embedding of $\ell_p$ losses into $\ell_\infty$ through exponential random variables, combined with a geometric mean estimator for variance reduction, represents a technically sophisticated and creative approach.
- The presentation of three successive algorithms (Algorithms 2--4), a warm-up, a parameterized version that achieves regret-communication tradeoff, and the fin
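The embedding mentioned in this review rests on a standard max-stability fact: if $E_1, \dots, E_s$ are i.i.d. standard exponentials, then $\max_i |x_i| / E_i^{1/p}$ has the same distribution as $\|x\|_p / E^{1/p}$ for a single standard exponential $E$. The sketch below is a centralized simulation of that fact under stated assumptions, not the paper's algorithm. The function names, the test vector, and the normalization by $e^{\gamma/p}$ (the large-sample limit of the geometric mean, where $\gamma$ is the Euler-Mascheroni constant) are choices made for illustration, and the paper's estimator may use a different correction.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # E[log E] = -gamma for E ~ Exp(1)


def lp_norm(x, p):
    """Exact l_p norm, used only as a reference value."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def scaled_max(x, p, rng):
    """One l_inf measurement of x after scaling coordinate i by 1 / E_i^(1/p)."""
    return max(abs(v) / rng.expovariate(1.0) ** (1.0 / p) for v in x)


def estimate_lp_norm(x, p, repetitions, rng):
    """Geometric mean of independent scaled maxima, divided by exp(gamma / p).

    Assumes x has at least one nonzero entry, so every logarithm is finite.
    """
    mean_log = sum(
        math.log(scaled_max(x, p, rng)) for _ in range(repetitions)
    ) / repetitions
    return math.exp(mean_log - EULER_GAMMA / p)


rng = random.Random(0)
x = [0.2, 0.9, 0.4, 0.1]  # hypothetical per-server losses of one expert
estimate = estimate_lp_norm(x, p=2, repetitions=20000, rng=rng)
exact = lp_norm(x, 2)
```

Averaging logarithms rather than raw maxima is what keeps the estimate stable here: each scaled maximum is heavy-tailed, but its logarithm has finite variance, so the geometric mean concentrates as the number of repetitions grows.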
- The authors state that it is information-theoretically impossible to achieve regret smaller than $O(1/\sqrt{T})$, but then compare their method to the algorithm of [1] for $R = O(1)$, claiming improved communication in that regime. This causes confusion as $R = O(1)$ is not achievable. Instead, taking $R \ge 1/\sqrt{T}$ seems to reproduce the communication cost of [1], showing no improvement compared to [1]. The claimed improvement in communication complexity should be either removed or clarif
I found the paper to be quite well motivated, since the problem feels motivated from both a theoretical and practical perspective. I think that the algorithmic ideas in the paper are nice and appear to be a natural approach for the problem.
My main concern about the paper is that there are some important steps in the proof where I have doubts about the correctness of the argument. (1) This is probably my most important concern and is about the important lemma 3.3. The proof is quite sketchy. It is stated that a conditional expectation (l755) equality holds for all realizations of $p_t$. Realizations of what? I assume of $p(t)$, but what is $p(t)$? I assume the probability vector of playing each expert at time $t$ because then the
Taxonomy
Topics: Optimization and Search Problems · Advanced Bandit Algorithms Research · Game Theory and Applications
