Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems

Jongyeong Lee; Junya Honda; Shinji Ito; Min-hwan Oh

arXiv:2508.18604 · stat.ML · August 27, 2025

TL;DR

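This paper establishes Best-of-Both-Worlds (BOBW) guarantees for Follow-the-Perturbed-Leader (FTPL) algorithms in bandit problems under unbounded perturbations. The guarantees cover asymmetric Fréchet-type perturbations, including hybrids with Gumbel-type tails, and symmetric perturbations in the two-armed case. The paper also identifies limitations of symmetric Fréchet-type perturbations in the multi-armed setting.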

Contribution

The paper extends BOBW results for FTPL to a broad family of unbounded perturbations, offers new insights into perturbation design, and analyzes the connection between the $1/2$-Tsallis entropy and Fréchet-type perturbations.

Findings

01. BOBW guarantees established for asymmetric unbounded Fréchet-type perturbations.

02. First BOBW guarantee for symmetric unbounded perturbations in two-armed bandits.

03. Limitations identified for symmetric Fréchet-type perturbations in multi-armed bandit settings.

Abstract

Follow-the-Regularized-Leader (FTRL) policies have achieved Best-of-Both-Worlds (BOBW) results in various settings through hybrid regularizers, whereas analogous results for Follow-the-Perturbed-Leader (FTPL) remain limited due to inherent analytical challenges. To advance the analytical foundations of FTPL, we revisit classical FTRL-FTPL duality for unbounded perturbations and establish BOBW results for FTPL under a broad family of asymmetric unbounded Fréchet-type perturbations, including hybrid perturbations combining Gumbel-type and Fréchet-type tails. These results not only extend the BOBW results of FTPL but also offer new insights into designing alternative FTPL policies competitive with hybrid regularization approaches. Motivated by earlier observations in two-armed bandits, we further investigate the connection between the $1/2$-Tsallis entropy and a Fréchet-type…
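For readers unfamiliar with FTPL in the bandit setting, the following is a minimal sketch of a generic FTPL policy with an asymmetric Fréchet perturbation. It is not taken from the paper. The function names, the learning rate $\eta_t = 1/\sqrt{t}$, the shape parameter `alpha = 2.0`, the Bernoulli loss model, and the use of geometric resampling to estimate inverse selection probabilities are all assumptions chosen for illustration.

```python
import math
import random


def sample_frechet(alpha, rng):
    """Draw from a Frechet distribution with CDF exp(-x^(-alpha)), x > 0."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); rng.random() lies in [0, 1)
        u = rng.random()
    return (-math.log(u)) ** (-1.0 / alpha)


def ftpl_choose(cum_loss_est, eta, alpha, rng):
    """Pick the arm that minimizes the perturbed estimated cumulative loss."""
    perturbed = [
        loss - sample_frechet(alpha, rng) / eta for loss in cum_loss_est
    ]
    return min(range(len(perturbed)), key=perturbed.__getitem__)


def geometric_resampling(cum_loss_est, eta, alpha, arm, rng, max_draws=10_000):
    """Count fresh draws until `arm` is selected again.

    Without the cap, the count is an unbiased estimate of 1 / P(arm chosen).
    The cap `max_draws` (an assumption here) bounds the running time and
    introduces a small bias.
    """
    for draws in range(1, max_draws + 1):
        if ftpl_choose(cum_loss_est, eta, alpha, rng) == arm:
            return draws
    return max_draws


def run_ftpl(mean_losses, horizon, alpha=2.0, seed=0):
    """Run FTPL on Bernoulli losses (hypothetical test environment)."""
    rng = random.Random(seed)
    num_arms = len(mean_losses)
    cum_loss_est = [0.0] * num_arms
    pulls = [0] * num_arms
    for t in range(1, horizon + 1):
        eta = 1.0 / math.sqrt(t)  # assumed learning-rate schedule
        arm = ftpl_choose(cum_loss_est, eta, alpha, rng)
        loss = 1.0 if rng.random() < mean_losses[arm] else 0.0
        pulls[arm] += 1
        if loss > 0.0:
            # Importance-weighted loss estimate for the pulled arm only.
            inv_prob = geometric_resampling(cum_loss_est, eta, alpha, arm, rng)
            cum_loss_est[arm] += loss * inv_prob
    return pulls


if __name__ == "__main__":
    print(run_ftpl([0.1, 0.9], horizon=500, seed=1))
```

Under these assumptions, the arm with the smaller mean loss should receive most of the pulls in a stochastic environment. The perturbation here is one-sided (strictly positive), which matches the asymmetric case; a symmetric variant would require a different sampler.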

Videos

Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems · SlidesLive