PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models
Chenzhuo Zhao, Ziqian Liu, Xinda Wang, Junting Lu, Chaoyi Ruan

TL;DR
PMPO is a lightweight, loss-based prompt optimization framework that improves language model performance across model sizes without requiring output sampling or human scoring to evaluate candidate prompts.
Contribution
PMPO is a unified prompt optimization method that uses token-level cross-entropy as its evaluation signal and a masking-based analysis to locate weak prompt segments. It applies to both small and large models.
Findings
Outperforms prior prompt optimizers on multiple benchmarks.
Achieves the highest average accuracy on BBH.
Raises AlpacaEval 2.0 win rates by over 19 points.
Abstract
Prompt optimization is a practical and widely applicable alternative to fine-tuning for improving large language model performance. Yet many existing methods evaluate candidate prompts by sampling full outputs, often coupled with self-critique or human-annotated preferences, which limits scalability, especially for smaller models or models that are not instruction-tuned. We present PMPO (Probabilistic Metric Prompt Optimization), a unified framework that uses token-level cross-entropy as a direct, lightweight evaluation signal. PMPO locates low-quality prompt segments via a masking-based analysis and iteratively rewrites them to propose improved variants. Crucially, during evaluation, PMPO selects among variants by minimizing loss in a single forward pass, eliminating output sampling and human or judge-based scoring for selection while still using standard generation only to propose…
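The selection and masking steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`cross_entropy`, `prompt_loss`, `select_prompt`, `segment_impact`) are hypothetical, and `toy_logprob_fn` is a deterministic stand-in for a real language model, assumed here only so the example runs without any model download. With a real model, the per-token log-probabilities would come from one teacher-forced forward pass over the reference output.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature: (context, target) -> log-prob of each target token.
LogProbFn = Callable[[str, str], Sequence[float]]
Example = Tuple[str, str]  # (input text, reference output)


def cross_entropy(logprobs: Sequence[float]) -> float:
    """Token-level cross-entropy: mean negative log-probability of the target tokens."""
    return -sum(logprobs) / len(logprobs)


def prompt_loss(prompt: str, examples: List[Example], logprob_fn: LogProbFn) -> float:
    """Average cross-entropy of the reference outputs when conditioned on `prompt`."""
    losses = [cross_entropy(logprob_fn(prompt + "\n" + x, y)) for x, y in examples]
    return sum(losses) / len(losses)


def select_prompt(candidates: List[str], examples: List[Example], logprob_fn: LogProbFn) -> str:
    """Pick the candidate with the lowest loss; no output sampling or judge is involved."""
    return min(candidates, key=lambda p: prompt_loss(p, examples, logprob_fn))


def segment_impact(segments: List[str], examples: List[Example], logprob_fn: LogProbFn) -> List[float]:
    """Masking analysis (one plausible reading): loss change when each segment is removed.

    A value near zero or below suggests the segment is not helping and may be
    a candidate for rewriting.
    """
    base = prompt_loss(" ".join(segments), examples, logprob_fn)
    impacts = []
    for i in range(len(segments)):
        masked = " ".join(s for j, s in enumerate(segments) if j != i)
        impacts.append(prompt_loss(masked, examples, logprob_fn) - base)
    return impacts


def toy_logprob_fn(context: str, target: str) -> List[float]:
    """ASSUMPTION: toy stand-in for a language model, not a real scorer.

    A target token gets probability 0.9 if it appears in the context, else 0.1.
    """
    seen = set(context.lower().split())
    return [math.log(0.9 if tok.lower() in seen else 0.1) for tok in target.split()]


examples = [("2 + 2", "answer 4")]
candidates = ["Reply briefly.", "State the answer as a number."]
best = select_prompt(candidates, examples, toy_logprob_fn)
print("selected prompt:", best)

impacts = segment_impact(["Be polite.", "State the answer as a number."], examples, toy_logprob_fn)
print("loss increase when each segment is masked:", impacts)
```

Because selection only needs a loss value per candidate, the same loop works for models that cannot reliably follow instructions or critique their own outputs, which is the scalability argument the abstract makes. Swapping `toy_logprob_fn` for a scorer backed by an actual model is the only change a real experiment would need in this sketch.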
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
