A Further Efficient Algorithm with Best-of-Both-Worlds Guarantees for $m$-Set Semi-Bandit Problem

Botao Chen; Jongyeong Lee; Chansoo Kim; Junya Honda

arXiv:2603.11764 · cs.LG · March 13, 2026

TL;DR

This paper shows that Follow-the-Perturbed-Leader (FTPL) with geometric resampling achieves the optimal $O(\sqrt{mdT})$ regret in the adversarial setting and logarithmic regret in the stochastic setting for $m$-set semi-bandit problems, giving a computationally efficient algorithm with best-of-both-worlds guarantees.

Contribution

It extends FTPL analysis with geometric resampling to m-set semi-bandits, proving optimal regret bounds and improving computational efficiency.

Findings

01

FTPL with Fréchet and Pareto distributions with certain parameters achieves the optimal regret of $O(\sqrt{mdT})$ in the adversarial setting.

02

FTPL with Fréchet and Pareto distributions with a certain parameter attains logarithmic regret in the stochastic setting.

03

Conditional geometric resampling reduces complexity from $O(d^2)$ to $O(md(\log(d/m)+1))$.
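
To make the findings above concrete, the following is a minimal sketch of FTPL with plain geometric resampling (GR) for an $m$-set semi-bandit. It is not the paper's implementation and does not include the conditional GR variant from finding 03. The function names, the learning rate `eta = 1/sqrt(t)`, the Fréchet shape `alpha = 2.0`, and the truncation cap `max_iter` are all assumptions chosen for illustration.

```python
import numpy as np


def select_m_set(cum_loss, eta, m, alpha, rng):
    """Pick the m arms minimizing (estimated cumulative loss - perturbation / eta)."""
    # Frechet(alpha) sample by inverse transform: E ** (-1/alpha) with E ~ Exp(1).
    r = rng.standard_exponential(cum_loss.shape[0]) ** (-1.0 / alpha)
    scores = cum_loss - r / eta
    # argpartition places the m smallest scores in the first m positions.
    return np.argpartition(scores, m - 1)[:m]


def geometric_resampling(cum_loss, eta, m, alpha, chosen, rng, max_iter=1000):
    """For each chosen arm, count resamples until it is selected again.

    The count for arm i is geometric with mean 1 / P(arm i is selected),
    so it serves as an estimate of the inverse selection probability.
    max_iter is a hypothetical truncation cap, not taken from the paper.
    """
    counts = np.zeros(cum_loss.shape[0], dtype=np.int64)
    pending = {int(i) for i in chosen}
    k = 0
    while pending and k < max_iter:
        k += 1
        resampled = {int(i) for i in select_m_set(cum_loss, eta, m, alpha, rng)}
        for i in pending & resampled:
            counts[i] = k
        pending -= resampled
    for i in pending:  # arms never reselected within the cap
        counts[i] = max_iter
    return counts


def run_ftpl(losses, m, alpha=2.0, seed=0):
    """Run FTPL + GR on a (T, d) loss matrix with entries in [0, 1]."""
    T, d = losses.shape
    rng = np.random.default_rng(seed)
    cum_loss = np.zeros(d)  # importance-weighted cumulative loss estimates
    total = 0.0
    for t in range(1, T + 1):
        eta = 1.0 / np.sqrt(t)  # assumed schedule, for illustration only
        chosen = select_m_set(cum_loss, eta, m, alpha, rng)
        observed = losses[t - 1, chosen]  # semi-bandit feedback: chosen arms only
        total += observed.sum()
        counts = geometric_resampling(cum_loss, eta, m, alpha, chosen, rng)
        cum_loss[chosen] += observed * counts[chosen]
    return total, cum_loss
```

A call such as `run_ftpl(np.random.default_rng(1).random((50, 6)), m=2)` returns the learner's total incurred loss and the final loss estimates. In this plain form every resample redraws all $d$ perturbations, which is the cost that the conditional variant in finding 03 is reported to reduce.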

Abstract

This paper studies the optimality and complexity of the Follow-the-Perturbed-Leader (FTPL) policy in $m$-set semi-bandit problems. FTPL has been studied extensively as a promising candidate for an efficient algorithm with favorable regret for adversarial combinatorial semi-bandits. Nevertheless, the optimality of FTPL has remained unknown, unlike Follow-the-Regularized-Leader (FTRL), whose optimality has been proved for various online learning tasks. In this paper, we extend the analysis of FTPL with geometric resampling (GR) to $m$-set semi-bandits, which is a special case of combinatorial semi-bandits, showing that FTPL with Fréchet and Pareto distributions with certain parameters achieves the best possible regret of $O(\sqrt{mdT})$ in the adversarial setting. We also show that FTPL with Fréchet and Pareto distributions with a certain parameter achieves a logarithmic regret for…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Optimization and Search Problems