Jump Start or False Start? A Theoretical and Empirical Evaluation of LLM-initialized Bandits

Adam Bayley; Xiaodan Zhu; Raquel Aoki; Yanshuai Cao; Kevin H. Wilson

arXiv:2604.02527 · cs.LG · April 6, 2026

TL;DR

This paper evaluates LLM-initialized bandits and finds that their benefit depends heavily on how well LLM-generated preferences align with actual user preferences: added noise and systematic misalignment can erase the warm-start advantage over a cold start, or reverse it.

Contribution

The paper provides a theoretical framework and an empirical analysis of when LLM-based warm-starts outperform cold-start bandits, accounting for noise and misalignment in the synthetic preference data.

Findings

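01. In aligned domains, warm-starting remains effective up to 30% corruption of the synthetic training data.

02. The warm-start advantage is lost at around 40% corruption, and performance degrades beyond 50%.

03. Systematic misalignment, even without added noise, can lead to higher regret than a cold-start bandit.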

Abstract

The recent advancement of Large Language Models (LLMs) offers new opportunities to generate user preference data to warm-start bandits. Recent studies on contextual bandits with LLM initialization (CBLI) have shown that these synthetic priors can significantly lower early regret. However, these findings assume that LLM-generated choices are reasonably aligned with actual user preferences. In this paper, we systematically examine how LLM-generated preferences perform when random and label-flipping noise is injected into the synthetic training data. For aligned domains, we find that warm-starting remains effective up to 30% corruption, loses its advantage around 40%, and degrades performance beyond 50%. When there is systematic misalignment, even without added noise, LLM-generated priors can lead to higher regret than a cold-start bandit. To explain these behaviors, we develop a…
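
The abstract describes injecting random and label-flipping noise into the synthetic training data at a given corruption rate. The following is a minimal sketch of what such corruption could look like, not the authors' procedure: it assumes each synthetic sample carries one integer label naming the preferred arm, and the function name `corrupt_labels`, its parameters, and the exact definitions of the two noise modes are hypothetical choices made for illustration.

```python
import numpy as np


def corrupt_labels(labels, n_arms, rate, mode="random", rng=None):
    """Return a copy of `labels` with each entry corrupted with probability `rate`.

    Assumed (hypothetical) noise definitions:
      mode="random": the label is redrawn uniformly over all arms, so it may
                     happen to equal the original label.
      mode="flip":   the label is replaced by a different arm, chosen uniformly
                     among the other arms (requires n_arms >= 2).
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    noisy = labels.copy()

    # Select which samples to corrupt, independently for each sample.
    mask = rng.random(labels.shape[0]) < rate
    k = int(mask.sum())

    if mode == "random":
        noisy[mask] = rng.integers(0, n_arms, size=k)
    elif mode == "flip":
        # An offset in [1, n_arms - 1], taken modulo n_arms, always lands on
        # an arm different from the original one.
        offsets = rng.integers(1, n_arms, size=k)
        noisy[mask] = (labels[mask] + offsets) % n_arms
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return noisy


# Example: corrupt 30% of 1,000 synthetic preference labels over 4 arms.
rng = np.random.default_rng(0)
clean = rng.integers(0, 4, size=1000)
noisy = corrupt_labels(clean, n_arms=4, rate=0.3, mode="flip", rng=rng)
```

Under these assumptions, a warm-start experiment would train the bandit's prior on `noisy` instead of `clean` and compare its regret against a cold-start baseline as `rate` varies.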
