Adversarial Online Learning with Temporal Feedback Graphs

Khashayar Gatmiry; Jon Schneider

arXiv:2407.00571·cs.LG·July 2, 2024

TL;DR

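This paper studies prediction with expert advice when the learner's action at round $t$ may depend only on losses from a subset of rounds specified by a known directed feedback graph. It gives an algorithm that partitions losses across sub-cliques of the graph, a lower bound that is tight in many practical settings, and, for transitive feedback graphs, an efficient implementation with regret that is optimal up to a universal constant.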

Contribution

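It proposes a novel learning algorithm that partitions the losses across sub-cliques of the feedback graph, complements it with a lower bound conjectured to be within a constant factor of optimal, and proves that the algorithm attains the optimal regret bound (up to a universal constant) for transitive feedback graphs.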

Findings

01

The algorithm obtains the optimal regret bound, up to a universal constant, for transitive feedback graphs.

02

The lower bound is tight in many practical settings and is conjectured to be within a constant factor of optimal.

03

For transitive feedback graphs, the algorithm is proven to be efficiently implementable.

Abstract

We study a variant of prediction with expert advice where the learner's action at round $t$ is only allowed to depend on losses on a specific subset of the rounds (where the structure of which rounds' losses are visible at time $t$ is provided by a directed "feedback graph" known to the learner). We present a novel learning algorithm for this setting based on a strategy of partitioning the losses across sub-cliques of this graph. We complement this with a lower bound that is tight in many practical settings, and which we conjecture to be within a constant factor of optimal. For the important class of transitive feedback graphs, we prove that this algorithm is efficiently implementable and obtains the optimal regret bound (up to a universal constant).
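
To make the setting concrete, the following is a minimal sketch of the feedback model only, not the paper's sub-clique partitioning algorithm. It assumes the feedback graph is given as a list `visible`, where `visible[t]` is the set of earlier rounds whose losses the learner may use at round `t`; that representation, the names `is_transitive`, `restricted_hedge`, and `regret`, and the learning rate `eta` are all illustrative assumptions. The learner shown is a plain exponential-weights (Hedge) baseline that simply ignores any round it is not allowed to see.

```python
import math


def is_transitive(visible):
    """Check transitivity of the feedback graph.

    Assumed convention: s in visible[t] means the loss of round s can be
    used at round t. The graph is transitive if everything visible at s
    is also visible at every round t that can see s.
    """
    for seen in visible:
        for s in seen:
            if not visible[s] <= seen:
                return False
    return True


def restricted_hedge(losses, visible, eta):
    """Baseline learner: exponential weights over visible rounds only.

    losses[t][i] is the loss of expert i at round t. Returns the list of
    probability distributions played at each round.
    """
    n_experts = len(losses[0])
    plays = []
    for t in range(len(losses)):
        # Cumulative loss of each expert, restricted to rounds visible at t.
        cum = [sum(losses[s][i] for s in visible[t]) for i in range(n_experts)]
        shift = min(cum)  # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (c - shift)) for c in cum]
        total = sum(weights)
        plays.append([w / total for w in weights])
    return plays


def regret(losses, plays):
    """Expected learner loss minus the loss of the best fixed expert."""
    n_experts = len(losses[0])
    learner = sum(
        sum(p * l for p, l in zip(plays[t], losses[t]))
        for t in range(len(losses))
    )
    best = min(sum(row[i] for row in losses) for i in range(n_experts))
    return learner - best


# Full feedback (every past round visible) is a transitive graph.
full = [set(), {0}, {0, 1}, {0, 1, 2}]
# Seeing only the previous round is not transitive: 0 -> 1 -> 2 but not 0 -> 2.
chain = [set(), {0}, {1}, {2}]

losses = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
plays = restricted_hedge(losses, full, eta=1.0)
print(is_transitive(full), is_transitive(chain), regret(losses, plays))
```

With `full`, this reduces to the classical experts setting; with sparser graphs such as `chain`, the baseline discards information that the abstract's algorithm is designed to exploit through its partitioning of losses.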

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning