PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains
Eyal Ben-David, Nadav Oved, Roi Reichart

TL;DR
PADA is an example-based prompt learning method for on-the-fly adaptation to unseen domains in NLP tasks. It requires no target-domain data, labeled or unlabeled, at training time, and it outperforms strong baselines in multi-source adaptation experiments.
Contribution
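The paper proposes PADA, a prompt learning approach based on the T5 language model. For each test example from an unseen domain, PADA generates a unique prompt and then conditions its prediction on that prompt, so adaptation needs no knowledge of the target domain at training time.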
Findings
PADA outperforms strong baselines in 14 multi-source adaptation scenarios.
It effectively generates domain-related prompts that improve task performance.
The method works across text classification and sequence tagging tasks.
Abstract
Natural Language Processing algorithms have made incredible progress, but they still struggle when applied to out-of-distribution examples. We address a challenging and underexplored version of this domain adaptation problem, where an algorithm is trained on several source domains, and then applied to examples from unseen domains that are unknown at training time. Particularly, no examples, labeled or unlabeled, or any other knowledge about the target domain are available to the algorithm at training time. We present PADA: An example-based autoregressive Prompt learning algorithm for on-the-fly Any-Domain Adaptation, based on the T5 language model. Given a test example, PADA first generates a unique prompt for it and then, conditioned on this prompt, labels the example with respect to the NLP prediction task. PADA is trained to generate a prompt which is a token sequence of unrestricted…
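The two-step inference described in the abstract (generate a prompt for the test example, then label the example conditioned on that prompt) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prefixes `PROMPT_PREFIX` and `TASK_PREFIX`, the `|` separator, and the rule-based `toy_generate` stand-in for a T5-style model are all assumptions introduced here.

```python
from typing import Callable, Tuple

# Hypothetical prefixes: the source does not say how the two modes are signalled.
PROMPT_PREFIX = "generate prompt: "
TASK_PREFIX = "predict label: "


def pada_style_predict(
    example: str, generate: Callable[[str], str]
) -> Tuple[str, str]:
    """Two-step inference: build an example-specific prompt, then label with it."""
    # Step 1: generate a prompt for this particular test example.
    prompt = generate(PROMPT_PREFIX + example)
    # Step 2: label the example, conditioned on the generated prompt.
    label = generate(f"{TASK_PREFIX}{prompt} | {example}")
    return prompt, label


def toy_generate(text: str) -> str:
    """Rule-based stand-in for a seq2seq model; for illustration only."""
    if text.startswith(PROMPT_PREFIX):
        body = text[len(PROMPT_PREFIX):].lower()
        return "restaurants, food" if "pizza" in body else "unknown"
    return "positive" if "great" in text.lower() else "negative"


prompt, label = pada_style_predict("The pizza was great.", toy_generate)
print(prompt, "->", label)
```

In a real setup, `generate` would wrap a single sequence-to-sequence model used for both steps. The toy function only makes the control flow visible.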
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
Methods: Linear Layer · Byte Pair Encoding · Dense Connections · Dropout · Attention Is All You Need · Gated Linear Unit · Layer Normalization · Softmax · Multi-Head Attention · SentencePiece
