An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

Taylor Sorensen; Joshua Robinson; Christopher Michael Rytting; Alexander Glenn Shaw; Kyle Jeffrey Rogers; Alexia Pauline Delorey; Mahmoud Khalil; Nancy Fulda; David Wingate

arXiv:2203.11364 · cs.CL · September 23, 2022 · 5 cites

Open Access

TL;DR

This paper proposes an information-theoretic method for prompt engineering that selects effective prompts without labeled data or direct model access, leveraging mutual information to improve task accuracy.

Contribution

The paper introduces a label-free prompt selection method based on the mutual information between the input and the model output. The method needs no direct access to the model, and the authors demonstrate its effectiveness on 8 datasets representing 7 distinct NLP tasks.

Findings

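01 High mutual information correlates with high task accuracy.

02 On the largest model, selecting prompts by mutual information closes 90% of the gap between average and best prompt accuracy, without labels.

03 The result holds across 8 datasets representing 7 distinct NLP tasks and across multiple models.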

Abstract

Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy…
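The selection rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that, for each candidate template, the model's output for every unlabeled input has already been collected as a probability distribution over a fixed set of answer options, and it estimates mutual information with the standard decomposition I(X; Y) = H(Y) - H(Y|X). The function names, the template names, and the toy numbers are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(dists: List[Sequence[float]]) -> float:
    """Estimate I(X; Y) = H(Y) - H(Y|X) from per-input output distributions.

    dists[i] is the model's distribution over answer options for input i
    (assumed to be precomputed elsewhere; no labels are used).
    """
    n = len(dists)
    k = len(dists[0])
    # H(Y): entropy of the output distribution averaged over all inputs.
    marginal = [sum(d[j] for d in dists) / n for j in range(k)]
    # H(Y|X): average entropy of the individual per-input distributions.
    conditional = sum(entropy(d) for d in dists) / n
    return entropy(marginal) - conditional


def select_template(outputs: Dict[str, List[Sequence[float]]]) -> str:
    """Return the name of the template with the highest estimated MI."""
    return max(outputs, key=lambda name: mutual_information(outputs[name]))


# Hypothetical toy data: two templates, three unlabeled inputs, two options.
toy_outputs = {
    "template_a": [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]],  # confident and varied
    "template_b": [[0.5, 0.5], [0.6, 0.4], [0.5, 0.5]],  # near-uniform
}
best = select_template(toy_outputs)
```

In this toy data, `template_a` is selected: its outputs are confident for each input yet vary across inputs, which is the pattern that yields high mutual information. A template that always returns the same answer, or always returns a near-uniform distribution, scores close to zero.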

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification