Beyond Bandit Feedback in Online Multiclass Classification
Dirk van der Hoeven, Federico Fusco, Nicolò Cesa-Bianchi

TL;DR
This paper introduces Gappletron, an online multiclass classification algorithm that effectively handles arbitrary feedback graphs, providing strong regret bounds and demonstrating competitive performance in synthetic experiments.
Contribution
We propose Gappletron, the first algorithm for online multiclass classification with arbitrary feedback graphs, and establish its theoretical regret bounds and practical competitiveness.
Findings
Surrogate regret bounds of order B√(ρKT), holding both in expectation and with high probability.
Constant surrogate regret of order B²K in the full information setting.
Lower bound of order max{B²K, √T}, showing near-optimality.
Abstract
We study the problem of online multiclass classification in a setting where the learner's feedback is determined by an arbitrary directed graph. While including bandit feedback as a special case, feedback graphs allow a much richer set of applications, including filtering and label efficient classification. We introduce Gappletron, the first online multiclass algorithm that works with arbitrary feedback graphs. For this new algorithm, we prove surrogate regret bounds that hold, both in expectation and with high probability, for a large class of surrogate losses. Our bounds are of order B√(ρKT), where B is the diameter of the prediction space, K is the number of classes, T is the time horizon, and ρ is the domination number (a graph-theoretic parameter affecting the amount of exploration). In the full information case, we show that Gappletron achieves a constant…
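To make the domination number ρ concrete, here is a minimal sketch that computes it by brute force for small feedback graphs. It is an illustration only, not code from the paper. It assumes a graph is given as a dictionary mapping each class to the set of classes it reveals (its out-neighbourhood), and that a dominating set is a set of classes whose members and out-neighbours together cover all K classes. The names `domination_number`, `bandit`, and `full_info` are hypothetical.

```python
from itertools import combinations


def domination_number(out_neighbors):
    """Size of the smallest dominating set of a directed graph.

    out_neighbors: dict mapping each node to the set of nodes it reveals.
    Brute force over subsets, so only suitable for a small number of classes.
    """
    nodes = list(out_neighbors)
    all_nodes = set(nodes)
    for size in range(1, len(nodes) + 1):
        for candidate in combinations(nodes, size):
            # A candidate covers itself plus everything its members reveal.
            covered = set(candidate)
            for v in candidate:
                covered |= out_neighbors[v]
            if covered >= all_nodes:
                return size
    return len(nodes)


K = 4
# Assumed encoding of bandit feedback: each class reveals only itself.
bandit = {k: {k} for k in range(K)}
# Assumed encoding of full information: each class reveals every class.
full_info = {k: set(range(K)) for k in range(K)}

rho_bandit = domination_number(bandit)
rho_full = domination_number(full_info)
```

Under these assumptions the bandit graph has ρ = K and the full information graph has ρ = 1, so the two settings sit at opposite ends of the range of ρ in the B√(ρKT) bound.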
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques
