Can discrete information extraction prompts generalize across language models?

Nathanaël Carraz Rakotonirina; Roberto Dessì; Fabio Petroni; Sebastian Riedel; Marco Baroni

arXiv:2302.09865 · cs.CL · March 8, 2023 · 1 citation

TL;DR

This paper investigates whether discrete prompts induced for one language model can generalize to others, proposing a training method that enhances cross-model transferability and analyzing properties of such prompts.

Contribution

It introduces a training approach that mixes language models during prompt induction, yielding prompts that generalize across models, and it analyzes the linguistic and structural properties of those prompts.

Findings

01. AutoPrompt prompts outperform manual prompts on slot-filling

02. Performance drops when prompts are transferred between models

03. Mixing models during training improves prompt generalization

Abstract

We study whether automatically-induced prompts that effectively extract information from a language model can also be used, out-of-the-box, to probe other language models for the same information. After confirming that discrete prompts induced with the AutoPrompt algorithm outperform manual and semi-manual prompts on the slot-filling task, we demonstrate a drop in performance for AutoPrompt prompts learned on a model and tested on another. We introduce a way to induce prompts by mixing language models at training time that results in prompts that generalize well across models. We conduct an extensive analysis of the induced prompts, finding that the more general prompts include a larger proportion of existing English words and have a less order-dependent and more uniform distribution of information across their component tokens. Our work provides preliminary evidence that it's possible…
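The transfer question in the abstract (prompts learned on one model, tested on another) can be made concrete with a small evaluation sketch. This is an illustration only, not the authors' code: the toy "models" are hypothetical lookup functions standing in for real language models, the `[X]`/`[MASK]` template convention and the names `precision_at_1` and `transfer_matrix` are assumptions, and the numbers it produces say nothing about the paper's results.

```python
from typing import Callable, Dict, List, Tuple

# A toy "language model": maps a filled-in prompt to its top-1 predicted object.
# Hypothetical stand-in; a real setup would query an actual LM here.
Model = Callable[[str], str]

FACTS: Dict[str, str] = {"France": "Paris", "Italy": "Rome"}


def make_toy_model(trigger: str) -> Model:
    """Build a toy model that only answers when the prompt contains `trigger`."""
    def model(text: str) -> str:
        if trigger in text:
            for subj, obj in FACTS.items():
                if subj in text:
                    return obj
        return "unknown"
    return model


def precision_at_1(model: Model, prompt: str, pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (subject, object) pairs whose object is the model's top-1 answer."""
    hits = sum(model(prompt.replace("[X]", subj)) == obj for subj, obj in pairs)
    return hits / len(pairs)


def transfer_matrix(
    models: Dict[str, Model],
    prompts: Dict[str, str],
    pairs: List[Tuple[str, str]],
) -> Dict[Tuple[str, str], float]:
    """Score every prompt (keyed by its source model) on every target model."""
    return {
        (src, tgt): precision_at_1(models[tgt], prompts[src], pairs)
        for src in prompts
        for tgt in models
    }


# Each toy model reacts to a different trigger token, mimicking prompts
# that are tuned to one model and fail on another.
models = {"A": make_toy_model("capital"), "B": make_toy_model("city")}
prompts = {"A": "[X] capital [MASK]", "B": "[X] city [MASK]"}
pairs = list(FACTS.items())

matrix = transfer_matrix(models, prompts, pairs)
for (src, tgt), score in sorted(matrix.items()):
    print(f"prompt from {src} -> tested on {tgt}: P@1 = {score:.2f}")
```

The diagonal entries of the matrix correspond to the same-model setting and the off-diagonal entries to the cross-model setting; a prompt that generalizes well would keep off-diagonal scores close to the diagonal ones.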

Code & Models

Repositories

ncarraz/prompt_generalization (official)

Videos

Can discrete information extraction prompts generalize across language models? · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification