Can discrete information extraction prompts generalize across language models?
Nathanaël Carraz Rakotonirina, Roberto Dessì, Fabio Petroni, Sebastian Riedel, Marco Baroni

TL;DR
This paper investigates whether discrete prompts induced for one language model can generalize to others, proposing a training method that enhances cross-model transferability and analyzing properties of such prompts.
Contribution
The paper introduces a training approach that mixes language models during prompt induction, producing prompts that generalize across models, and it analyzes the linguistic and structural properties of those prompts.
Findings
AutoPrompt prompts outperform manual prompts on slot-filling
Performance drops when prompts are transferred between models
Mixing models during training improves prompt generalization
Abstract
We study whether automatically-induced prompts that effectively extract information from a language model can also be used, out-of-the-box, to probe other language models for the same information. After confirming that discrete prompts induced with the AutoPrompt algorithm outperform manual and semi-manual prompts on the slot-filling task, we demonstrate a drop in performance for AutoPrompt prompts learned on a model and tested on another. We introduce a way to induce prompts by mixing language models at training time that results in prompts that generalize well across models. We conduct an extensive analysis of the induced prompts, finding that the more general prompts include a larger proportion of existing English words and have a less order-dependent and more uniform distribution of information across their component tokens. Our work provides preliminary evidence that it's possible…
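The cross-model evaluation described in the abstract (induce a prompt on one model, test it on another) can be illustrated with a toy sketch. Everything here is a hypothetical stand-in rather than the authors' code: the "models" are simple functions that answer only when a trigger string appears in the prompt, and the prompt templates, the `[MASK]` placeholder, and the names `make_toy_model`, `precision_at_1`, and `transfer_matrix` are assumptions made for illustration. No real language model or AutoPrompt search is involved.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a language model: maps a filled prompt to a top-1 answer.
Model = Callable[[str], str]

# Toy knowledge shared by both stand-in models (illustrative data only).
CAPITALS = {"France": "Paris", "Japan": "Tokyo"}


def make_toy_model(trigger: str) -> Model:
    """Build a toy model that answers only if `trigger` occurs in the prompt."""
    def predict(prompt: str) -> str:
        if trigger not in prompt:
            return "<unk>"
        for subject, obj in CAPITALS.items():
            if subject in prompt:
                return obj
        return "<unk>"
    return predict


def precision_at_1(model: Model, template: str, facts: List[Tuple[str, str]]) -> float:
    """Fraction of (subject, object) facts for which the model's answer is correct."""
    hits = sum(model(template.format(subject=s)) == o for s, o in facts)
    return hits / len(facts)


def transfer_matrix(
    models: Dict[str, Model],
    prompts: Dict[str, str],
    facts: List[Tuple[str, str]],
) -> Dict[Tuple[str, str], float]:
    """Score every prompt (keyed by where it was induced) on every target model."""
    return {
        (src, tgt): precision_at_1(models[tgt], template, facts)
        for src, template in prompts.items()
        for tgt in models
    }


facts = list(CAPITALS.items())
models = {"A": make_toy_model("xq"), "B": make_toy_model("capital")}
prompts = {
    "A": "{subject} xq [MASK]",              # works only on toy model A
    "B": "{subject} capital [MASK]",         # works only on toy model B
    "mixed": "{subject} xq capital [MASK]",  # contains both triggers
}
matrix = transfer_matrix(models, prompts, facts)
for key in sorted(matrix):
    print(key, matrix[key])
```

In this toy setup the model-specific prompts score well only on their own model, while the prompt containing both triggers scores well on both. This mirrors the shape of the reported finding, not its actual numbers or mechanism.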
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
