When Do We Need LLMs? A Diagnostic for Language-Driven Bandits
Uljad Berdica, Fernando Acero, Anton Ipsen, Parisa Zehtabi, Michael Cashmore, Manuela Veloso

TL;DR
This paper compares LLM-based and lightweight numerical bandit algorithms for decision-making with textual and numerical context, proposing a diagnostic to determine when LLMs are necessary.
Contribution
It introduces LLMP-UCB, a bandit algorithm that derives uncertainty estimates from LLMs via repeated inference, and a geometric diagnostic to guide the choice between LLMs and lightweight bandits.
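To make the idea of an upper confidence bound built from repeated inference concrete, here is a minimal sketch. It is not the paper's LLMP-UCB implementation: the scoring rule (sample mean plus a multiple of the sample standard deviation) is an assumption chosen for illustration, and `sample_reward`, `n_samples`, and `beta` are hypothetical names. In practice `sample_reward` would wrap a stochastic LLM call that returns a predicted reward for an arm.

```python
import statistics


def ucb_from_samples(samples, beta=1.0):
    """Optimistic score: sample mean plus beta times the sample standard deviation.

    Assumption: disagreement across repeated predictions is used as the
    uncertainty signal. A single sample carries no spread information.
    """
    mean = statistics.fmean(samples)
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean + beta * spread


def select_arm(sample_reward, n_arms, n_samples=8, beta=1.0):
    """Query the (hypothetical) predictor n_samples times per arm, pick the best score."""
    scores = []
    for arm in range(n_arms):
        samples = [sample_reward(arm) for _ in range(n_samples)]
        scores.append(ucb_from_samples(samples, beta))
    # Ties resolve to the lowest arm index.
    return max(range(n_arms), key=scores.__getitem__)
```

Note that each decision costs `n_arms * n_samples` predictor calls, which illustrates why repeated LLM inference at every step is expensive.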
Findings
Lightweight numerical bandits operating on text embeddings match or exceed the accuracy of LLM-based solutions at a fraction of their cost.
Embedding dimensionality influences exploration-exploitation tradeoffs.
The diagnostic guides cost-effective deployment of decision systems.
Abstract
We study Contextual Multi-Armed Bandits (CMABs) for non-episodic sequential decision-making problems where the context includes both textual and numerical information (e.g., recommendation systems, dynamic portfolio adjustments, offer selection; all frequent problems in finance). While Large Language Models (LLMs) are increasingly applied to these settings, utilizing LLMs for reasoning at every decision step is computationally expensive and uncertainty estimates are difficult to obtain. To address this, we introduce LLMP-UCB, a bandit algorithm that derives uncertainty estimates from LLMs via repeated inference. However, our experiments demonstrate that lightweight numerical bandits operating on text embeddings (dense or Matryoshka) match or exceed the accuracy of LLM-based solutions at a fraction of their cost. We further show that embedding dimensionality is a practical lever on the…
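As an illustration of what a lightweight numerical bandit over text embeddings can look like, the sketch below uses disjoint LinUCB, a standard linear contextual bandit. The abstract does not say which bandit algorithms the authors evaluated, so this choice is an assumption. `truncate_embedding` mimics Matryoshka-style use of an embedding prefix; the function name, `alpha`, and `ridge` are hypothetical, and the embedding vector is assumed to come from some external text encoder.

```python
import numpy as np


def truncate_embedding(embedding, dim):
    """Keep the first `dim` coordinates and re-normalise to unit length."""
    prefix = np.asarray(embedding, dtype=float)[:dim]
    norm = np.linalg.norm(prefix)
    return prefix / norm if norm > 0 else prefix


class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per arm."""

    def __init__(self, n_arms, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # width of the exploration bonus
        self.A = np.stack([ridge * np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def scores(self, x):
        out = np.empty(len(self.A))
        for arm, (A, b) in enumerate(zip(self.A, self.b)):
            theta = np.linalg.solve(A, b)    # ridge estimate of the arm's weights
            A_inv_x = np.linalg.solve(A, x)  # used for the confidence width
            out[arm] = theta @ x + self.alpha * np.sqrt(x @ A_inv_x)
        return out

    def select(self, x):
        return int(np.argmax(self.scores(x)))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

A typical loop would call `x = truncate_embedding(full_embedding, dim)`, then `arm = bandit.select(x)`, observe a reward, and call `bandit.update(arm, x, reward)`. Each step solves small `dim` by `dim` linear systems, so a smaller `dim` lowers per-step cost and changes the size of the confidence bonus.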
