Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning
Ahmed Elshabrawy, Yongxin Huang, Iryna Gurevych, Alham Fikri Aji

TL;DR
This paper introduces Statement-Tuning, a method that enables smaller encoder models to perform zero-shot and few-shot tasks by training them to discriminate between finite statements, achieving competitive results with fewer parameters.
Contribution
It presents Statement-Tuning, a novel approach that allows encoder models to generalize to zero-shot and few-shot tasks by modeling discriminative tasks as statement sets.
Findings
Statement-Tuning achieves competitive performance with fewer parameters.
Task and statement diversity improve zero-shot generalization.
Strong performance with modest training data.
Abstract
While Large Language Models (LLMs) exhibit remarkable capabilities in zero-shot and few-shot scenarios, they often require computationally prohibitive sizes. Conversely, smaller Masked Language Models (MLMs) like BERT and RoBERTa achieve state-of-the-art results through fine-tuning but struggle with extending to few-shot and zero-shot settings due to their architectural constraints. Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an encoder model to discriminate between the potential statements to determine the label. We do Statement-Tuning on multiple tasks to enable cross-task generalization. Experimental results demonstrate that Statement-Tuning achieves competitive performance compared to state-of-the-art LLMs with significantly fewer parameters. Moreover, the study investigates the impact of several design…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Smart Grid Security and Resilience
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Sparse Evolutionary Training · Weight Decay · Dense Connections · Residual Connection · Softmax · Adam · Linear Warmup With Linear Decay · Layer Normalization
