Parameter-Free Dynamic Regret for Unconstrained Linear Bandits

Alberto Rumi; Andrew Jacobsen; Nicolò Cesa-Bianchi; Fabio Vitale

arXiv:2603.25916·cs.LG·March 30, 2026

TL;DR

This paper introduces a parameter-free algorithm for unconstrained adversarial linear bandits that minimizes dynamic regret without prior knowledge of the number of comparator switches $S_T$, matching the optimal rate up to poly-logarithmic factors.

Contribution

It presents the first algorithm for linear bandits that attains the optimal regret bound of order $\sqrt{d (1 + S_T) T}$, up to poly-logarithmic factors, without knowing the number of comparator switches $S_T$ in advance.

Findings

01

Achieves regret of order $\sqrt{d (1 + S_T) T}$ up to poly-logarithmic factors.

02

Provides a simple method to combine guarantees of multiple bandit algorithms.

03

Resolves a long-standing open problem in adaptive regret minimization.

Abstract

We study dynamic regret minimization in unconstrained adversarial linear bandit problems. In this setting, a learner must minimize the cumulative loss relative to an arbitrary sequence of comparators $u_1, \dots, u_T$ in $\mathbb{R}^d$, but receives only point-evaluation feedback on each round. We provide a simple approach to combining the guarantees of several bandit algorithms, allowing us to optimally adapt to the number of switches $S_T = \sum_t \mathbb{I}\{u_t \neq u_{t-1}\}$ of an arbitrary comparator sequence. In particular, we provide the first algorithm for linear bandits achieving the optimal regret guarantee of order $O(\sqrt{d (1 + S_T) T})$ up to poly-logarithmic terms without prior knowledge of $S_T$, thus resolving a long-standing open problem.
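To make the two quantities in the abstract concrete, the following is a minimal Python sketch that computes the switch count $S_T$ and the dynamic regret on a toy sequence. It is an illustration only, not the paper's algorithm. The names `losses` (loss vectors $g_t$), `plays` (learner's points $w_t$), and the helper functions are assumptions introduced here, and the sum defining $S_T$ is assumed to start at $t = 2$. Dynamic regret is taken in its standard linear form, $\sum_t \langle g_t, w_t - u_t \rangle$. In the bandit setting the learner would only observe the scalar $\langle g_t, w_t \rangle$ each round; the full vectors appear here purely for evaluation.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def count_switches(comparators: Sequence[Vector]) -> int:
    """S_T: number of rounds t >= 2 where u_t differs from u_{t-1}."""
    return sum(
        1
        for prev, cur in zip(comparators, comparators[1:])
        if tuple(cur) != tuple(prev)
    )


def dynamic_regret(
    losses: Sequence[Vector],
    plays: Sequence[Vector],
    comparators: Sequence[Vector],
) -> float:
    """Sum over rounds of <g_t, w_t> - <g_t, u_t> for linear losses."""
    return sum(
        dot(g, w) - dot(g, u)
        for g, w, u in zip(losses, plays, comparators)
    )


# Toy instance with d = 2 and T = 4 (hypothetical data).
losses = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
plays = [(0.0, 0.0)] * 4  # a learner that always plays the origin
comparators = [(-1.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]

switches = count_switches(comparators)  # comparator changes once, at t = 3
regret = dynamic_regret(losses, plays, comparators)
print(switches, regret)
```

On this toy instance the comparator changes once, so $S_T = 1$, and the learner that never moves from the origin incurs dynamic regret 2 against it.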
