# No-Regret Linear Bandits beyond Realizability

**Authors:** Chong Liu, Ming Yin, Yu-Xiang Wang

arXiv: 2302.13252 · 2023-07-21

## TL;DR

This paper introduces a new model of misspecification in linear bandits that depends on the suboptimality gap, and shows that the classical LinUCB algorithm remains robust under this model, achieving near-optimal regret.

## Contribution

The paper proposes a gap-dependent misspecification model for linear bandits and shows that LinUCB, without modification, is robust under this model.

## Key findings

- LinUCB achieves near-optimal $\sqrt{T}$ regret under gap-adjusted misspecification.
- The new model captures realistic misspecification scenarios where errors are proportional to suboptimality gaps.
- A novel self-bounding proof technique is developed to analyze regret under misspecification.
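As a rough illustration of the findings above, the two misspecification models can be contrasted as follows. The symbols $f$ (true reward), $w$ (linear parameter), $f^*$ (optimal value), and $\rho$ (gap-adjusted misspecification level) are notation assumed here for the sketch and may not match the paper's exact definitions.

$$
\text{uniform: } \sup_{x} \bigl| f(x) - w^\top x \bigr| \le \epsilon,
\qquad
\text{gap-adjusted: } \bigl| f(x) - w^\top x \bigr| \le \rho \, \bigl( f^* - f(x) \bigr) \ \text{ for all } x,
\quad f^* = \max_{x} f(x).
$$

Under the gap-adjusted condition the allowed error shrinks to zero at the optimum and grows with suboptimality. A self-bounding argument can then be read schematically as below, where $R_T$ is the cumulative regret and $C$, $c$ are hypothetical constants (logarithmic factors ignored); the paper's actual argument is likely more involved.

$$
R_T \le C \sqrt{T} + c \, \rho \, R_T
\quad \Longrightarrow \quad
R_T \le \frac{C \sqrt{T}}{1 - c \rho}
\qquad (c \rho < 1).
$$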

## Abstract

We study linear bandits when the underlying reward function is not linear. Existing work relies on a uniform misspecification parameter $\epsilon$ that measures the sup-norm error of the best linear approximation. This results in an unavoidable linear regret whenever $\epsilon > 0$. We describe a more natural model of misspecification which only requires the approximation error at each input $x$ to be proportional to the suboptimality gap at $x$. It captures the intuition that, for optimization problems, near-optimal regions should matter more and we can tolerate larger approximation errors in suboptimal regions. Quite surprisingly, we show that the classical LinUCB algorithm -- designed for the realizable case -- is automatically robust against such gap-adjusted misspecification. It achieves a near-optimal $\sqrt{T}$ regret for problems where the best-known regret is almost linear in the time horizon $T$. Technically, our proof relies on a novel self-bounding argument that bounds the part of the regret due to misspecification by the regret itself.
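The setting in the abstract can be tried out with a small simulation. The following is a minimal sketch, not the authors' implementation: the reward construction in `make_reward`, the fixed confidence width `beta`, the arm set, and all parameter values are assumptions chosen for illustration. The reward is deliberately non-linear but satisfies a gap-adjusted condition of the form $|f(x) - w^\top x| \le \rho\,(f^* - f(x))$ with $\rho < 1$, and a standard LinUCB loop is run on it unchanged.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    return [dot(row, v) for row in M]


def make_reward(w, arms, rho):
    """Return (f, f_star) for a hypothetical non-linear reward f that
    satisfies |f(x) - w.x| <= rho * (f_star - f(x)) on the given arms."""
    g_star = max(dot(w, x) for x in arms)

    def f(x):
        # Scale the linear gap by s(x) in [1/(1+rho), 1/(1-rho)]; this keeps
        # the approximation error within rho times the true gap.
        s = 1.0 / (1.0 + rho * math.sin(5.0 * x[0]))
        return g_star - (g_star - dot(w, x)) * s

    # The optimal arm has zero gap, so the maximum of f equals g_star.
    return f, g_star


def linucb(arms, f, f_star, T, beta=1.0, lam=1.0, noise=0.1, seed=0):
    """Run LinUCB with a fixed width beta (an assumed value, not the
    paper's choice) and return the cumulative regret after each round."""
    rng = random.Random(seed)
    d = len(arms[0])
    # Inverse of the regularized design matrix, starting from (lam * I)^-1.
    A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)] for i in range(d)]
    b = [0.0] * d
    total, regret = 0.0, []
    for _ in range(T):
        theta = mat_vec(A_inv, b)  # ridge-regression estimate

        def ucb(x):
            width = math.sqrt(max(dot(x, mat_vec(A_inv, x)), 0.0))
            return dot(theta, x) + beta * width

        x = max(arms, key=ucb)
        y = f(x) + rng.gauss(0.0, noise)
        # Sherman-Morrison rank-one update of (A + x x^T)^-1.
        Ax = mat_vec(A_inv, x)
        denom = 1.0 + dot(x, Ax)
        for i in range(d):
            for j in range(d):
                A_inv[i][j] -= Ax[i] * Ax[j] / denom
        for i in range(d):
            b[i] += y * x[i]
        total += f_star - f(x)
        regret.append(total)
    return regret


# Sixteen arms on the unit circle; the linear surrogate prefers angle 0.
arms = [[math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16)]
        for k in range(16)]
w = [1.0, 0.0]
f, f_star = make_reward(w, arms, rho=0.3)
regret = linucb(arms, f, f_star, T=2000)
print("average per-round regret:", regret[-1] / len(regret))
```

Plotting `regret` against the round index for several values of `rho` is one way to see how the curve changes as misspecification grows. The simulation only illustrates the setting and does not verify the paper's bound.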

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.13252/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/2302.13252/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/2302.13252/full.md

---
Source: https://tomesphere.com/paper/2302.13252