A Component-Based Survey of Interactions between Large Language Models and Multi-Armed Bandits
Siguang Chen, Chunli Lv, Miao Xie

TL;DR
This survey systematically reviews how large language models and multi-armed bandit algorithms interact at the component level, highlighting mutual benefits and analyzing existing systems to guide future research in adaptive decision-making and language understanding.
Contribution
It is the first comprehensive survey to analyze the bidirectional interactions between LLMs and MABs at the component level, providing insights and a literature index.
Findings
MAB algorithms help address LLM challenges like retrieval and personalization.
LLMs improve MAB systems by redefining core components such as environment modeling.
The survey analyzes existing LLM-enhanced bandit systems and their performance.
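To make the first finding concrete, here is a minimal sketch of how a bandit could choose among retrieval strategies for an LLM. It uses UCB1, a standard bandit algorithm chosen here for illustration; the survey summary does not say which algorithms the reviewed systems use. The arm names, the reward probabilities, and the helper names `ucb1_select` and `run_bandit` are all hypothetical, and rewards are simulated as Bernoulli draws rather than taken from any real system.

```python
import math
import random

def ucb1_select(counts, values, t):
    """Pick an arm at round t (1-based) using the UCB1 rule."""
    # Pull every arm once before trusting the confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Score = empirical mean reward + exploration bonus.
    scores = [
        values[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)

def run_bandit(reward_probs, rounds, seed=0):
    """Simulate UCB1 on Bernoulli arms; return pull counts and mean rewards."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    values = [0.0] * len(reward_probs)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, values, t)
        # Simulated feedback, e.g. "was the retrieved context helpful?"
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the arm's running mean reward.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

# Hypothetical arms: three retrieval strategies with assumed success rates.
arms = {"keyword": 0.2, "dense": 0.5, "hybrid": 0.8}
counts, values = run_bandit(list(arms.values()), rounds=5000)
for name, n, v in zip(arms, counts, values):
    print(f"{name}: pulled {n} times, mean reward {v:.2f}")
```

In this simulation the pulls should concentrate on the arm with the highest assumed success rate, while the exploration bonus keeps the other arms sampled occasionally.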
Abstract
Large language models (LLMs) have become powerful and widely used systems for language understanding and generation, while multi-armed bandit (MAB) algorithms provide a principled framework for adaptive decision-making under uncertainty. This survey explores the potential at the intersection of these two fields. To the best of our knowledge, it is the first survey to systematically review the bidirectional interaction between large language models and multi-armed bandits at the component level. We highlight the benefits in both directions. MAB algorithms address critical LLM challenges spanning pre-training, retrieval-augmented generation (RAG), and personalization. Conversely, LLMs enhance MAB systems by redefining core components such as arm definition and environment modeling, thereby improving decision-making in sequential tasks. We analyze existing LLM-enhanced bandit systems and bandit-enhanced…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
