Constant regret for sequence prediction with limited advice

El Mehdi Saad (LMO); G. Blanchard (LMO; DATASHAPE)

arXiv:2210.02256 · math.ST · October 6, 2022 · ALT

TL;DR

This paper introduces a sequence prediction strategy that achieves constant regret, meaning regret that does not grow with the horizon length, by combining and observing more than one expert per round. This improves on the single-expert bandit setting, where the optimal regret grows with the horizon.

Contribution

The paper proposes a strategy in which the learner predicts with, and observes the losses of, more than one expert per round. In limited advice settings this yields constant regret bounds that are optimal up to a logarithmic factor in the number of experts.

Findings

01

Allowing two experts to be observed per round yields constant regret.

02

The proposed strategy is optimal up to a logarithmic factor in the number of experts.

03

Playing and observing a single expert per round (the standard multi-armed bandit setting) yields only slow regret of order $\sqrt{KT}$.
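
For reference, the regret that these findings refer to can be written as below. The notation is an assumption made here for illustration and may differ from the paper's own symbols: $T$ is the number of rounds, $K$ the number of experts, $\ell$ the loss function, $\hat{y}_t$ the learner's prediction, $F_{i,t}$ the prediction of expert $i$, and $y_t$ the outcome in round $t$.

$$
R_T = \sum_{t=1}^{T} \ell(\hat{y}_t, y_t) - \min_{1 \le i \le K} \sum_{t=1}^{T} \ell(F_{i,t}, y_t)
$$

Constant regret means that $R_T$ is bounded by a quantity that does not depend on $T$, whereas a square-root rate means that the bound grows like $\sqrt{KT}$.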

Abstract

We investigate the problem of cumulative regret minimization for individual sequence prediction with respect to the best expert in a finite family of size K under limited access to information. We assume that in each round, the learner can predict using a convex combination of at most p experts, and can then observe, a posteriori, the losses of at most m experts. We assume that the loss function is range-bounded and exp-concave. In the standard multi-armed bandits setting, when the learner is allowed to play only one expert per round and observe only its feedback, known optimal regret bounds are of the order $O(\sqrt{KT})$. We show that allowing the learner to play one additional expert per round and observe one additional feedback substantially improves the regret guarantees. We provide a strategy combining only p = 2 experts per round for prediction and observing m…
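
The interaction protocol described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's strategy: the function name `run_protocol`, the squared loss on [0, 1] (a standard example of a bounded, exp-concave loss), and the placeholder rule that picks experts uniformly at random are all hypothetical choices made here. The sketch only shows what the learner may do in each round: combine at most `p` experts, then observe the losses of at most `m` experts.

```python
import random


def squared_loss(prediction, outcome):
    """Squared loss; bounded and exp-concave when both arguments lie in [0, 1]."""
    return (prediction - outcome) ** 2


def run_protocol(expert_preds, outcomes, p=2, m=2, seed=0):
    """Simulate the limited-advice protocol with a placeholder strategy.

    expert_preds[t][i] is the prediction of expert i in round t.
    The random selection rule below is a stand-in (an assumption),
    not the strategy analysed in the paper.
    """
    rng = random.Random(seed)
    num_experts = len(expert_preds[0])
    p = min(p, num_experts)
    m = min(m, num_experts)

    learner_loss = 0.0
    expert_losses = [0.0] * num_experts       # full information, used only for evaluation
    observation_counts = [0] * num_experts    # how often each expert's loss was revealed

    for preds, y in zip(expert_preds, outcomes):
        # Step 1: predict with a convex combination of at most p experts
        # (here: equal weights on p experts chosen at random).
        played = rng.sample(range(num_experts), p)
        prediction = sum(preds[i] for i in played) / len(played)
        learner_loss += squared_loss(prediction, y)

        # Step 2: observe the losses of at most m experts (here: chosen at random).
        # A real strategy would use these observations to guide later choices.
        for i in rng.sample(range(num_experts), m):
            observation_counts[i] += 1

        # Bookkeeping for evaluation only; the learner never sees all of this.
        for i in range(num_experts):
            expert_losses[i] += squared_loss(preds[i], y)

    return {
        "regret": learner_loss - min(expert_losses),
        "observation_counts": observation_counts,
    }


if __name__ == "__main__":
    horizon = 50
    preds = [[0.0, 0.5, 1.0] for _ in range(horizon)]
    result = run_protocol(preds, [0.0] * horizon, p=2, m=2, seed=1)
    print(result)
```

Because the placeholder rule ignores the observed losses, its regret generally grows with the number of rounds; the point of the sketch is the shape of the protocol, in which `p` and `m` bound what the learner may play and see per round.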

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Distributed Sensor Networks and Detection Algorithms