Online Learning with Feedback Graphs: Beyond Bandits

Noga Alon; Nicolò Cesa-Bianchi; Ofer Dekel; Tomer Koren

arXiv:1502.07617 · cs.LG · February 27, 2015 · 55 cites

TL;DR

This paper classifies feedback graphs in online learning problems and characterizes how their structure influences the minimax regret, extending previous work and connecting to partial monitoring games.

Contribution

It introduces a classification of feedback graphs into three classes and derives regret bounds for each, generalizing prior results and analyzing time-varying graphs.

Findings

01

Strongly observable graphs lead to $Θ(α^{1/2} T^{1/2})$ regret, where $α$ is the independence number of the graph.

02

Weakly observable graphs lead to $Θ(δ^{1/3} T^{2/3})$ regret, where $δ$ is the domination number of the graph.

03

Unobservable graphs result in linear regret.
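
As a sanity check on the strongly observable rate $α^{1/2} T^{1/2}$, two familiar special cases can be plugged in. This is a sketch under assumptions not stated on this page: $K$ denotes the number of actions, the two graph descriptions are the standard way of encoding bandits and experts as feedback graphs, and logarithmic factors are ignored.

```latex
\begin{align*}
\text{bandits (self-loops only):} \quad
  & \alpha = K \;\Rightarrow\; \alpha^{1/2} T^{1/2} = \sqrt{KT}, \\
\text{experts (every action observes every action):} \quad
  & \alpha = 1 \;\Rightarrow\; \alpha^{1/2} T^{1/2} = \sqrt{T}.
\end{align*}
```

Both agree, up to logarithmic factors, with the classical rates for the multi-armed bandit problem and for prediction with expert advice.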

Abstract

We study a general class of online learning problems where the feedback is specified by a graph. This class includes online prediction with expert advice and the multi-armed bandit problem, but also several learning problems where the online player does not necessarily observe his own loss. We analyze how the structure of the feedback graph controls the inherent difficulty of the induced $T$-round learning problem. Specifically, we show that any feedback graph belongs to one of three classes: strongly observable graphs, weakly observable graphs, and unobservable graphs. We prove that the first class induces learning problems with $Θ(α^{1/2} T^{1/2})$ minimax regret, where $α$ is the independence number of the underlying graph; the second class induces problems with $Θ(δ^{1/3} T^{2/3})$ minimax regret, where $δ$ is the domination number of…
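
The three-way classification can be illustrated with a short Python sketch. The abstract excerpt above does not spell out the definitions, so the ones used here are recalled from the full paper and should be treated as assumptions: a vertex is observable if it has at least one incoming edge (a self-loop counts), and strongly observable if it has a self-loop or an incoming edge from every other vertex. The function name `classify_feedback_graph` and the adjacency-dictionary encoding are hypothetical choices made for this illustration.

```python
from typing import Dict, Set


def classify_feedback_graph(out_neighbors: Dict[int, Set[int]]) -> str:
    """Classify a directed feedback graph.

    out_neighbors[i] is the set of actions whose loss is revealed when
    action i is played (it may contain i itself, i.e. a self-loop).
    Assumption: every vertex appears as a key, even with an empty set.
    """
    vertices = set(out_neighbors)

    # in_neighbors[j] = actions whose play reveals the loss of j.
    in_neighbors: Dict[int, Set[int]] = {v: set() for v in vertices}
    for i, outs in out_neighbors.items():
        for j in outs:
            in_neighbors[j].add(i)

    all_strong = True
    for v in vertices:
        observers = in_neighbors[v]
        if not observers:
            # Some action's loss can never be observed.
            return "unobservable"
        has_self_loop = v in observers
        seen_by_all_others = (vertices - {v}) <= observers
        if not (has_self_loop or seen_by_all_others):
            all_strong = False

    return "strongly observable" if all_strong else "weakly observable"


# Bandit feedback: each action reveals only its own loss.
bandit = {0: {0}, 1: {1}, 2: {2}}
# Loopless clique: each action reveals every loss except its own.
loopless_clique = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
# Every action is observed by someone, but none is strongly observed.
weak = {0: {1, 2}, 1: {0}, 2: set()}
# Action 1 is never observed by any action.
blind = {0: {0}, 1: set()}

for name, g in [("bandit", bandit), ("loopless clique", loopless_clique),
                ("weak", weak), ("blind", blind)]:
    print(name, "->", classify_feedback_graph(g))
```

The loopless clique shows why the class goes "beyond bandits": the player never sees the loss of the action it plays, yet the graph still falls in the strongly observable class under the definition assumed here.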

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms