Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis
Sohee Yang, Jonghyeon Kim, Joel Jang, Seonghyeon Ye, Hyunji Lee, Minjoon Seo

TL;DR
This paper introduces a unified framework for interpreting and evaluating probability-based prompt selection methods for large language models. It shows that the existing methods are variants of mutual information maximization and proposes a calibration method that improves prompt selection across 13 diverse NLP tasks.
Contribution
It unifies existing prompt selection methods under a mutual information perspective and proposes a novel calibration technique to improve prompt selection accuracy.
Findings
Raised prompt selection effectiveness (performance of the selected prompt relative to the oracle prompt) from 87.79% to 96.85%.
Unified interpretation of existing methods as mutual information maximization.
With the proposed calibration method, the selected prompt reaches 99.44% of the performance of the oracle prompt without calibration.
Abstract
Previous works in prompt engineering for large language models have introduced different gradient-free probability-based prompt selection methods that aim to choose the optimal prompt among the candidates for a given task but have failed to provide a comprehensive and fair comparison between each other. In this paper, we propose a unified framework to interpret and evaluate the existing probability-based prompt selection methods by performing extensive experiments on 13 common and diverse NLP tasks. We find that each of the existing methods can be interpreted as some variant of the method that maximizes mutual information between the input and the predicted output (MI). Utilizing this finding, we develop several other combinatorial variants of MI and increase the effectiveness of the oracle prompt selection method from 87.79% to 94.98%, measured as the ratio of the performance of the…
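To make the MI criterion mentioned in the abstract concrete, here is a minimal sketch of mutual-information-based prompt selection. It assumes the common formulation in which a prompt's score is the entropy of the average predicted label distribution minus the average entropy of the per-input distributions; the paper's exact variants may differ. The function names, the prompt names, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    # Shannon entropy in nats; clip to avoid log(0).
    p = np.clip(p, eps, 1.0)
    return -(p * np.log(p)).sum(axis=axis)

def mutual_information(probs):
    # probs has shape (n_inputs, n_labels); row i is p(y | x_i, prompt).
    # Assumed estimator: H(mean_x p(y|x)) - mean_x H(p(y|x)).
    marginal = probs.mean(axis=0)
    return float(entropy(marginal) - entropy(probs, axis=1).mean())

def select_prompt(probs_by_prompt):
    # Score every candidate prompt and return the highest-scoring one.
    scores = {name: mutual_information(p) for name, p in probs_by_prompt.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy output distributions for two hypothetical prompts on four unlabeled inputs.
candidates = {
    # Confident and balanced across labels: high mutual information.
    "prompt_a": np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]),
    # Near-uniform on every input: mutual information close to zero.
    "prompt_b": np.array([[0.6, 0.4], [0.55, 0.45], [0.5, 0.5], [0.6, 0.4]]),
}
best, scores = select_prompt(candidates)
print(best, scores)
```

The score needs only the model's output probabilities on unlabeled inputs, which is what makes this family of methods gradient-free. A prompt scores highly when the model is confident on individual inputs while still spreading its predictions across labels overall.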
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
