How Confident Is the First Token? An Uncertainty-Calibrated Prompt Optimization Framework for Large Language Model Classification and Understanding
Wei Chen, Guoyang Ju, Yuanyuan Qi

TL;DR
This paper introduces a novel uncertainty measurement method called Log-Scale Focal Uncertainty (LSFU) for large language models, enabling more reliable prompt optimization by distinguishing true confidence from prior-induced spurious confidence.
Contribution
The paper proposes LSFU, a first-token-based uncertainty metric that incorporates class priors, and develops UCPOF, an uncertainty-calibrated prompt optimization framework that improves accuracy and reduces computational costs.
Findings
UCPOF improves accuracy by 6.03% over few-shot baselines.
UCPOF surpasses full RAG by 5.75% in average accuracy.
UCPOF reduces retrieval trigger rate by 50.66%.
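One way to read "retrieval trigger rate" is as the fraction of inputs for which retrieval is actually invoked. The sketch below assumes a simple gate in which retrieval fires only when an uncertainty score exceeds a threshold. The function name, the threshold rule, and the example scores are all hypothetical and are not taken from the paper.

```python
def retrieval_trigger_rate(uncertainties, threshold):
    """Fraction of inputs whose uncertainty score exceeds the threshold.

    Assumption: retrieval is triggered only for inputs with
    uncertainty > threshold. This gating rule is illustrative,
    not the paper's stated procedure.
    """
    if not uncertainties:
        raise ValueError("need at least one uncertainty score")
    triggered = [u > threshold for u in uncertainties]
    return sum(triggered) / len(triggered)


# Made-up scores for four inputs; two of them exceed the threshold of 0.5.
example_scores = [0.1, 0.8, 0.3, 0.9]
example_rate = retrieval_trigger_rate(example_scores, threshold=0.5)
```

Under this reading, always-on retrieval corresponds to a rate of 1.0, and a gated system saves one retrieval call for every input that stays below the threshold.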
Abstract
With the widespread adoption of large language models (LLMs) in natural language processing, prompt engineering and retrieval-augmented generation (RAG) have become mainstream ways to enhance LLM performance on complex tasks. However, LLMs generate outputs autoregressively, which makes some output uncertainty inevitable. Since model performance is highly sensitive to prompt design, precise uncertainty measurement is crucial for reliable prompt optimization. For multi-class multiple-choice (understanding) tasks, conventional uncertainty measures based on output probabilities (e.g., entropy) treat all classes equally and ignore differences in class priors arising from pretraining corpora. Because they cannot distinguish spurious confidence (from priors) from true certainty (from contextual understanding), they yield poor confidence calibration. To address this, we propose Log-Scale Focal Uncertainty (LSFU), a…
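To make the abstract's point about conventional measures concrete, here is a minimal sketch of Shannon entropy computed over a first-token distribution restricted to the class labels. It illustrates only the entropy baseline, not LSFU, whose definition is not given in this excerpt. The function name and the example probabilities are hypothetical.

```python
import math


def first_token_entropy(class_probs):
    """Shannon entropy (in nats) of a first-token distribution over class labels.

    class_probs maps each label to a non-negative probability mass;
    the masses are renormalized so they sum to 1 over the labels.
    """
    total = sum(class_probs.values())
    if total <= 0:
        raise ValueError("probabilities must have positive total mass")
    entropy = 0.0
    for p in class_probs.values():
        q = p / total
        if q > 0:  # skip zero-mass labels, where q * log(q) is taken as 0
            entropy -= q * math.log(q)
    return entropy


# Hypothetical first-token probabilities for a three-way classification prompt.
peaked_on_a = {"A": 0.90, "B": 0.05, "C": 0.05}
peaked_on_b = {"A": 0.05, "B": 0.90, "C": 0.05}
uniform = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}

h_a = first_token_entropy(peaked_on_a)
h_b = first_token_entropy(peaked_on_b)
h_uniform = first_token_entropy(uniform)
```

Here `h_a` and `h_b` are equal up to floating-point rounding, because entropy depends only on the shape of the distribution and not on which label holds the mass. A model that strongly favours a label with a high prior therefore looks exactly as certain as one that favours a rarely predicted label, which is the class-agnostic behaviour the abstract criticizes.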
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
