TextBO: Bayesian Optimization in Language Space for Eval-Efficient Self-Improving AI
Enoch Hyunwook Kang, Hema Yoganarasimhan

TL;DR
TextBO is a language-space Bayesian optimization method for evaluation-efficient self-improving AI. It outperforms existing approaches on automated ad-alignment tasks and on agentic AI benchmarks.
Contribution
TextBO extends Bayesian optimization to the language domain by combining textual gradients with Best-of-N selection, enabling evaluation-efficient self-improvement without an explicit surrogate model.
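The combination described above can be pictured with a minimal sketch. It is not the authors' implementation: the excerpt does not specify how candidates are generated or selected, so `propose_edits` (standing in for a textual-gradient step), `proxy_score` (standing in for the Best-of-N selector), and the toy keyword scorer are all hypothetical names chosen for illustration. The point it tries to show is the budget structure: many cheap candidates per round, but only one call to the expensive evaluator.

```python
from typing import Callable, List, Tuple


def eval_efficient_loop(
    seed_prompt: str,
    propose_edits: Callable[[str, int], List[str]],  # hypothetical textual-gradient step
    proxy_score: Callable[[str], float],             # hypothetical cheap Best-of-N selector
    evaluate: Callable[[str], float],                # the costly, reliable evaluation
    rounds: int,
    n_candidates: int,
) -> Tuple[str, float, int]:
    """Return (best prompt, its evaluated score, number of costly evaluations used)."""
    best_prompt = seed_prompt
    best_score = evaluate(seed_prompt)
    evals_used = 1
    for _ in range(rounds):
        # Generation is assumed cheap: draw N revised prompts from the current best.
        candidates = propose_edits(best_prompt, n_candidates)
        if not candidates:
            break
        # Best-of-N: keep only the candidate the cheap selector prefers.
        chosen = max(candidates, key=proxy_score)
        # Spend exactly one costly evaluation per round.
        score = evaluate(chosen)
        evals_used += 1
        if score > best_score:
            best_prompt, best_score = chosen, score
    return best_prompt, best_score, evals_used


# Toy stand-ins (assumptions for the demo only; no LLM is called).
KEYWORDS = ["concise", "friendly", "on-brand"]


def toy_propose(prompt: str, n: int) -> List[str]:
    # Each "edit" appends one word to the prompt.
    return [f"{prompt} {word}" for word in (KEYWORDS + ["verbose"])[:n]]


def toy_score(prompt: str) -> float:
    # Counts how many distinct target keywords the prompt contains.
    words = prompt.split()
    return float(sum(k in words for k in KEYWORDS))


result = eval_efficient_loop(
    "Write an ad.", toy_propose, toy_score, toy_score, rounds=3, n_candidates=4
)
print(result)
```

In this toy run the selector and the evaluator are the same function, which a real setting would not allow; the selector would have to be something cheaper than the evaluation it is saving.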
Findings
TextBO outperforms baselines such as Best-of-N and GEPA on ad-alignment tasks.
TextBO significantly improves performance on agentic AI benchmarks.
The method operates purely in language space without explicit uncertainty models.
Abstract
Large Language Models (LLMs) have enabled self-improving AI systems that iteratively generate, evaluate, and refine their outcomes. Recent studies show that prompt-optimization-based self-improvement can outperform state-of-the-art reinforcement-learning fine-tuning of LLMs, but performance is typically measured by generation efficiency. However, in many applications, the constraint is evaluation efficiency: obtaining reliable feedback is far more costly than generating candidates. To optimize for evaluation efficiency, we extend Upper Confidence Bound-Bayesian Optimization (UCB-BO), a framework known for optimal evaluation-efficiency guarantees, to the language domain. Doing so is challenging for two reasons: (i) gradients needed for UCB-BO are ill-defined in discrete prompt space; and (ii) UCB-style exploration relies on a surrogate model and acquisition function, which only live…
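For context, the classical UCB-BO selection rule in a continuous domain is usually written as below. The notation is standard textbook notation and an assumption here, not taken from the paper: $\mu_{t-1}$ and $\sigma_{t-1}$ are the surrogate model's posterior mean and standard deviation after $t-1$ evaluations, and $\beta_t$ is an exploration weight.

```latex
x_t = \arg\max_{x \in \mathcal{X}} \; \mu_{t-1}(x) + \sqrt{\beta_t}\,\sigma_{t-1}(x)
```

In continuous spaces the maximization is typically carried out with a numerical optimizer, often gradient-based. Both that step and the surrogate quantities $\mu_{t-1}$, $\sigma_{t-1}$ are what the abstract identifies as having no direct counterpart in discrete prompt space.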
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Multimodal Machine Learning Applications
