Last-Iterate Analyses of FTRL with the 1/2-Tsallis Entropy in Stochastic Bandits

Jingxin Zhan; Yuze Han; Zhihua Zhang

arXiv:2510.22819 · cs.LG · May 4, 2026

TL;DR

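This paper analyzes the last-iterate convergence of FTRL with the 1/2-Tsallis entropy regularizer in stochastic bandits. It shows that the Bregman divergence induced by this regularizer decays at a rate of t^{-1/2}, which only partially confirms the intuition that logarithmic regret should correspond to a t^{-1} last-iterate rate.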

Contribution

It provides the first theoretical analysis of the last-iterate convergence rate for FTRL with 1/2-Tsallis entropy in stochastic bandits.

Findings

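01. The Bregman divergence induced by the 1/2-Tsallis entropy regularizer decays at a rate of t^{-1/2}.

02. Intuitively, logarithmic regret should correspond to a t^{-1} last-iterate convergence rate; the rate actually proven is t^{-1/2}.

03. The analysis therefore partially confirms the intuition connecting regret and last-iterate convergence rate.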
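To make the quantity in the first finding concrete, the Bregman divergence of the 1/2-Tsallis regularizer $\Psi(p) = -4 \sum_{i=1}^{d} \sqrt{p_i}$ can be written in closed form. The derivation below is a sketch under one assumption that this page does not spell out: the divergence is measured from the iteration-$t$ distribution $p$ to the point mass $e_{i^*}$ on the optimal arm $i^*$ (both symbols are introduced here for illustration).

$$
\begin{aligned}
\nabla \Psi(p)_i &= -\frac{2}{\sqrt{p_i}}, \\
D_\Psi(q, p) &= \Psi(q) - \Psi(p) - \langle \nabla \Psi(p),\, q - p \rangle
 = 2 \sum_{i=1}^{d} \frac{\left(\sqrt{q_i} - \sqrt{p_i}\right)^2}{\sqrt{p_i}}, \\
D_\Psi(e_{i^*}, p) &= \frac{2\left(1 - \sqrt{p_{i^*}}\right)^2}{\sqrt{p_{i^*}}} + 2 \sum_{i \neq i^*} \sqrt{p_i}.
\end{aligned}
$$

Under this reading, the divergence is zero exactly when $p$ puts all its mass on $i^*$, and the second term shows that it is driven by the square roots of the probabilities assigned to suboptimal arms.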

Abstract

The convergence analysis of online learning algorithms is central to machine learning theory, where last-iterate convergence is particularly important, as it captures the learner's actual decisions and describes the evolution of the learning process over time. However, in multi-armed bandits, most existing algorithmic analyses focus mainly on the order of regret, while the last-iterate (simple regret) convergence rate remains less explored, especially for the widely studied Follow-the-Regularized-Leader (FTRL) algorithms. Recently, FTRL with the $1/2$-Tsallis entropy regularizer $\Psi(p) = -4 \sum_{i=1}^{d} \sqrt{p_i}$ (the $1/2$-Tsallis-INF algorithm, arXiv:1807.07623) was shown to achieve logarithmic regret in stochastic bandits. Nevertheless, its last-iterate convergence rate has not yet been studied. Intuitively, logarithmic regret should correspond to a $t^{-1}$ last-iterate…
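As a rough illustration of the setting the abstract describes, the following Python sketch runs FTRL with the regularizer $\Psi(p) = -4 \sum_{i=1}^{d} \sqrt{p_i}$ on a Bernoulli-loss bandit and records a Bregman divergence at every round. It is not the authors' implementation. The function names, the learning rate $\eta_t = 1/\sqrt{t}$, the importance-weighted loss estimator, the Bernoulli losses, and the choice to measure the divergence from the current distribution to the point mass on the best arm are all assumptions made for this example.

```python
import math
import random


def tsallis_ftrl_distribution(cum_loss, eta, iters=100):
    """Minimize <cum_loss, p> + psi(p)/eta over the simplex,
    with psi(p) = -4 * sum(sqrt(p_i)).

    Stationarity gives p_i = 4 / (eta * (cum_loss[i] + lam))**2, and the
    normalizer lam is found by bisection so that the weights sum to 1.
    """
    d = len(cum_loss)
    low_loss = min(cum_loss)
    # At lam = lo the smallest-loss arm alone has weight 1 (sum >= 1);
    # at lam = hi every arm has weight at most 1/d (sum <= 1).
    lo = 2.0 / eta - low_loss
    hi = 2.0 * math.sqrt(d) / eta - low_loss
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        total = sum(4.0 / (eta * (l + lam)) ** 2 for l in cum_loss)
        if total > 1.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    p = [4.0 / (eta * (l + lam)) ** 2 for l in cum_loss]
    s = sum(p)  # renormalize to remove the residual bisection error
    return [x / s for x in p]


def bregman_to_point_mass(p, star):
    """D_psi(e_star, p) for psi(p) = -4 * sum(sqrt(p_i)),
    where e_star is the point mass on arm `star` (assumed argument order)."""
    root = math.sqrt(p[star])
    others = sum(math.sqrt(x) for i, x in enumerate(p) if i != star)
    return 2.0 * (1.0 - root) ** 2 / root + 2.0 * others


def run_tsallis_inf(means, horizon, seed=0):
    """Simulate the sketch on Bernoulli losses with the given mean losses."""
    rng = random.Random(seed)
    d = len(means)
    cum_loss = [0.0] * d
    star = min(range(d), key=lambda i: means[i])  # arm with smallest mean loss
    trace = []
    for t in range(1, horizon + 1):
        eta = 1.0 / math.sqrt(t)  # assumed learning-rate schedule
        p = tsallis_ftrl_distribution(cum_loss, eta)
        trace.append(bregman_to_point_mass(p, star))
        arm = rng.choices(range(d), weights=p)[0]
        loss = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli loss
        cum_loss[arm] += loss / p[arm]  # importance-weighted loss estimate
    return trace


if __name__ == "__main__":
    trace = run_tsallis_inf([0.2, 0.5, 0.8], horizon=2000)
    for t in (10, 100, 1000, 2000):
        print(t, trace[t - 1])
```

A single simulated run is noisy, so the printed values are only a qualitative check on how the divergence evolves; they are not evidence for or against any particular rate.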
