Cooperative Online Learning with Feedback Graphs
Nicolò Cesa-Bianchi, Tommaso R. Cesari, Riccardo Della Vecchia

TL;DR
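This paper investigates how communication and feedback structures interact in cooperative online learning. It bounds the network regret in terms of the independence number of the strong product of the communication network and the feedback graph, proves an instance-based lower bound, and validates the results with experiments on synthetic data.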
Contribution
It introduces a unified analysis framework linking feedback graphs and communication networks, deriving regret bounds and establishing lower bounds for cooperative online learning.
Findings
Regret bounds depend on the independence number of the strong product of the communication network and the feedback graph.
The analysis generalizes many existing bounds for expert and bandit feedback.
Experiments on synthetic data support the theoretical regret bounds.
Abstract
We study the interplay between communication and feedback in a cooperative online learning setting, where a network of communicating agents learn a common sequential decision-making task through a feedback graph. We bound the network regret in terms of the independence number of the strong product between the communication network and the feedback graph. Our analysis recovers as special cases many previously known bounds for cooperative online learning with expert or bandit feedback. We also prove an instance-based lower bound, demonstrating that our positive results are not improvable except in pathological cases. Experiments on synthetic data confirm our theoretical findings.
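The graph quantity named in the abstract can be illustrated with a small sketch. The code below is not taken from the paper: it assumes both graphs are undirected and given as node and edge lists, and the helper names `strong_product` and `independence_number` are hypothetical. It builds the strong product, in which two distinct pairs are adjacent when each coordinate is either equal or adjacent in its own graph, and then finds the independence number by exhaustive search, which is only practical for very small graphs.

```python
from itertools import combinations


def strong_product(nodes_g, edges_g, nodes_h, edges_h):
    """Return (nodes, adjacency) for the strong product of two undirected graphs.

    Illustrative helper, not from the paper. Two distinct product nodes
    (a, i) and (b, j) are adjacent when a and b are equal or adjacent in G,
    and i and j are equal or adjacent in H.
    """
    sym_g = {(u, v) for u, v in edges_g} | {(v, u) for u, v in edges_g}
    sym_h = {(u, v) for u, v in edges_h} | {(v, u) for u, v in edges_h}

    def close(x, y, sym):
        return x == y or (x, y) in sym

    nodes = [(a, i) for a in nodes_g for i in nodes_h]
    adj = {p: set() for p in nodes}
    for p, q in combinations(nodes, 2):
        if close(p[0], q[0], sym_g) and close(p[1], q[1], sym_h):
            adj[p].add(q)
            adj[q].add(p)
    return nodes, adj


def independence_number(nodes, adj):
    """Size of a largest independent set, by branch and bound (exponential time)."""
    best = 0

    def extend(candidates, size):
        nonlocal best
        # Prune branches that cannot beat the best set found so far.
        if size + len(candidates) <= best:
            return
        if not candidates:
            best = size
            return
        v, rest = candidates[0], candidates[1:]
        # Branch 1: take v, which rules out all of its neighbours.
        extend([u for u in rest if u not in adj[v]], size + 1)
        # Branch 2: skip v.
        extend(rest, size)

    extend(list(nodes), 0)
    return best


# Toy instance (assumed for illustration): three agents on a path,
# three actions whose feedback graph is also a path.
agents, links = [0, 1, 2], [(0, 1), (1, 2)]
actions, feedback = ["a", "b", "c"], [("a", "b"), ("b", "c")]
product_nodes, product_adj = strong_product(agents, links, actions, feedback)
print(independence_number(product_nodes, product_adj))  # prints 4
```

In this toy instance the product is a 3-by-3 grid with diagonal edges, and the four corner pairs form a largest independent set. With a fully connected communication network and a feedback graph without edges between distinct actions, the same computation returns the number of actions.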
Taxonomy
Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Cognitive Radio Networks and Spectrum Sensing
