A Best-of-Both-Worlds Proof for Tsallis-INF without Fenchel Conjugates

Wei-Cheng Lee; Francesco Orabona

arXiv:2511.11211·cs.LG·November 17, 2025

TL;DR

This paper gives a simplified proof of the best-of-both-worlds guarantees of the Tsallis-INF algorithm for multi-armed bandits. The proof avoids conjugate functions and leaves the constants in the bounds unoptimized in exchange for a shorter argument.

Contribution

It offers a new, streamlined derivation of the guarantees for Tsallis-INF, using modern tools from online convex optimization to simplify the proof structure.

Findings

01

Simplified proof of Tsallis-INF guarantees

02

Avoids use of conjugate functions in proof

03

Favors a shorter proof over optimized constants in the bounds

Abstract

In this short note, we present a simple derivation of the best-of-both-worlds guarantee for the Tsallis-INF multi-armed bandit algorithm of Zimmert and Seldin (J. Zimmert and Y. Seldin. Tsallis-INF: An optimal algorithm for stochastic and adversarial bandits. Journal of Machine Learning Research, 22(28):1-49, 2021. URL https://jmlr.csail.mit.edu/papers/volume22/19-753/19-753.pdf). In particular, the proof uses modern tools from online convex optimization and avoids the use of conjugate functions. Also, we do not optimize the constants in the bounds, in favor of a slimmer proof.

Taxonomy

Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques