UPA: Unsupervised Prompt Agent via Tree-Based Search and Selection

Siran Peng; Weisong Zhao; Tianyu Fu; Chenxu Zhao; Tianshuo Zhang; Haoyuan Zhang; Xiangyu Zhu; Minghui Wu; Zhen Lei

arXiv:2601.23273 · cs.CL · May 12, 2026

TL;DR

UPA is an unsupervised prompt optimization method that searches the prompt space by growing a tree guided by pairwise LLM comparisons. It is reported to outperform existing methods without requiring ground-truth rewards.

Contribution

The paper proposes an unsupervised prompt agent that combines structured tree search with a selection stage based on pairwise comparisons and the Bradley-Terry-Luce (BTL) model, removing the need for supervised rewards.
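To make the role of the BTL model concrete, the following is a minimal sketch of how pairwise win counts can be turned into one global score per prompt. It uses the standard minorization-maximization update for BTL maximum likelihood, which is not necessarily the fitting procedure used in the paper. The function name `fit_btl`, the `wins` matrix layout, and the iteration settings are assumptions made for illustration.

```python
def fit_btl(wins, iters=200, tol=1e-10):
    """Estimate BTL strengths from a matrix of pairwise win counts.

    wins[i][j] is the number of times candidate i was preferred over
    candidate j (hypothetical layout; the diagonal is ignored).
    Returns strengths that sum to 1, where under the BTL model
    P(i beats j) = p[i] / (p[i] + p[j]).
    """
    n = len(wins)
    p = [1.0 / n] * n  # start from uniform strengths
    for _ in range(iters):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair contributes (number of comparisons) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        norm = sum(new_p)
        new_p = [x / norm for x in new_p]  # fix the arbitrary scale
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


if __name__ == "__main__":
    # Toy data: three candidate prompts, ten comparisons per pair.
    toy_wins = [
        [0, 8, 9],
        [2, 0, 7],
        [1, 3, 0],
    ]
    strengths = fit_btl(toy_wins)
    ranking = sorted(range(len(strengths)), key=lambda i: -strengths[i])
    print(ranking, strengths)
```

Fitting a single strength per candidate is what lets many local, possibly inconsistent comparisons be summarized on one scale. Note that this simple update assumes every candidate wins at least one comparison; a candidate with zero wins would need regularization.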

Findings

01. UPA outperforms existing prompt optimization methods across multiple tasks.

02. The two-stage framework (search, then selection) filters and selects high-quality prompts without ground-truth rewards.

03. Tree-based search guided by pairwise LLM comparisons is effective for prompt discovery.

Abstract

Prompt agents have recently emerged as a promising paradigm for automated prompt optimization, framing prompt discovery as a sequential decision-making problem over a structured prompt space. While this formulation enables the use of advanced planning algorithms, these methods typically assume access to supervised reward signals, which are often unavailable in practical scenarios. In this work, we propose UPA, an Unsupervised Prompt Agent that realizes structured search and selection without relying on ground-truth (GT) rewards. Specifically, during search, UPA iteratively constructs an evolving tree structure to navigate the prompt space, guided by fine-grained and position-debiased pairwise comparisons from Large Language Models (LLMs). Crucially, as these local comparisons do not inherently yield a consistent global scale, we decouple systematic prompt exploration from final…
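The abstract mentions position-debiased pairwise comparisons without detailing the mechanism in the text shown here. One common way to reduce position bias, sketched below, is to query the judge with both orderings and average the results. This is an illustrative assumption, not the paper's confirmed procedure; `debiased_preference`, `toy_judge`, and the quality values are all hypothetical, and a real setup would replace `toy_judge` with an LLM call.

```python
from typing import Callable

# A judge takes (first_shown, second_shown) and returns a preference
# in [0, 1] for the candidate shown first. Hypothetical interface.
Judge = Callable[[str, str], float]


def debiased_preference(judge: Judge, a: str, b: str) -> float:
    """Return a preference for `a` over `b` averaged over both orderings."""
    pref_a_first = judge(a, b)          # a is shown in the first slot
    pref_a_second = 1.0 - judge(b, a)   # a is shown in the second slot
    return 0.5 * (pref_a_first + pref_a_second)


# Toy stand-in for an LLM judge with an additive first-position bias.
_QUALITY = {"prompt_A": 0.7, "prompt_B": 0.4}  # made-up values
_POSITION_BIAS = 0.2


def toy_judge(first: str, second: str) -> float:
    raw = 0.5 + (_QUALITY[first] - _QUALITY[second]) / 2 + _POSITION_BIAS
    return min(1.0, max(0.0, raw))


if __name__ == "__main__":
    print(toy_judge("prompt_A", "prompt_B"))  # inflated by the bias
    print(debiased_preference(toy_judge, "prompt_A", "prompt_B"))
```

With a purely additive bias, as in this toy judge, averaging the two orderings cancels the bias exactly. Real judges may show bias that is not additive, in which case swapping reduces but does not eliminate it.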
