Self-Concordant Perturbations for Linear Bandits

Lucas Lévy; Jean-Lou Valeau; Arya Akhavan; Patrick Rebeschini

arXiv:2510.24187·stat.ML·February 13, 2026

TL;DR

This paper introduces a unified framework for adversarial linear bandits based on self-concordant perturbations, leading to a new FTPL-based algorithm whose regret bound on the hypercube improves on earlier methods by a factor of √d.

Contribution

The paper extends the known connection between FTRL and FTPL from the full-information setting to adversarial linear bandits, and introduces self-concordant perturbations, which yield a novel FTPL-based algorithm with an improved regret bound on the hypercube.

Findings

01

Achieves regret of O(d√(n log n)) on both the d-dimensional hypercube and the ℓ₂ ball.

02

Matches SCRiBLe rate on ℓ₂ ball.

03

Improves the regret bound on the hypercube by a factor of √d.

Abstract

We consider the adversarial linear bandits setting and present a unified algorithmic framework that bridges Follow-the-Regularized-Leader (FTRL) and Follow-the-Perturbed-Leader (FTPL) methods, extending the known connection between them from the full-information setting. Within this framework, we introduce self-concordant perturbations, a family of probability distributions that mirror the role of self-concordant barriers previously employed in the FTRL-based SCRiBLe algorithm. Using this idea, we design a novel FTPL-based algorithm that combines self-concordant regularization with efficient stochastic exploration. Our approach achieves a regret of $O(d\sqrt{n \ln n})$ on both the $d$-dimensional hypercube and the $\ell_{2}$ ball. On the $\ell_{2}$ ball, this matches the rate attained by SCRiBLe. For the hypercube, this represents a $\sqrt{d}$ improvement over these methods and…
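
For readers unfamiliar with FTPL, the basic perturb-then-minimise step that the abstract builds on can be sketched in the simpler full-information setting. This is not the paper's algorithm: bandit feedback, loss estimation, and the self-concordant perturbation family are all omitted, and the Gaussian noise, the function names `ftpl_hypercube` and `regret`, and the step size `eta` are illustrative assumptions.

```python
import numpy as np


def ftpl_hypercube(losses, eta, rng):
    """Generic full-information FTPL on the hypercube [-1, 1]^d.

    losses: array of shape (n, d), one loss vector per round.
    eta:    step size weighting the cumulative loss against the noise.
    Returns the (n, d) array of actions played.
    """
    n, d = losses.shape
    cumulative = np.zeros(d)
    actions = np.empty((n, d))
    for t in range(n):
        # Fresh Gaussian perturbation each round (assumed for illustration;
        # the paper uses self-concordant perturbations instead).
        noise = rng.standard_normal(d)
        perturbed = eta * cumulative + noise
        # A linear function over the hypercube is minimised at a corner:
        # each coordinate takes the sign opposite to the perturbed loss.
        actions[t] = np.where(perturbed > 0, -1.0, 1.0)
        # Full information: the whole loss vector is observed after playing.
        cumulative += losses[t]
    return actions


def regret(losses, actions):
    """Total loss incurred minus the loss of the best fixed corner."""
    incurred = np.sum(losses * actions)
    best_fixed = -np.abs(losses.sum(axis=0)).sum()
    return incurred - best_fixed
```

Calling `ftpl_hypercube(losses, eta=1.0, rng=np.random.default_rng(0))` on an `(n, d)` loss array returns the sequence of corners played, which `regret` then compares against the best fixed corner in hindsight.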
