Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning

Ahmed Elshabrawy; Yongxin Huang; Iryna Gurevych; Alham Fikri Aji

arXiv:2404.12897·cs.CL·October 18, 2024

TL;DR

This paper introduces Statement-Tuning, a method that enables smaller encoder models to handle zero-shot and few-shot tasks by training them to discriminate among a finite set of candidate statements, achieving results competitive with state-of-the-art LLMs while using significantly fewer parameters.

Contribution

The paper presents Statement-Tuning, a novel approach that lets encoder models generalize to zero-shot and few-shot tasks by casting each discriminative task as a finite set of statements.

Findings

01. Statement-Tuning achieves performance competitive with state-of-the-art LLMs while using significantly fewer parameters.

02. Task and statement diversity improve zero-shot generalization.

03. Statement-Tuning reaches strong performance with modest training data.

Abstract

While Large Language Models (LLMs) exhibit remarkable capabilities in zero-shot and few-shot scenarios, they often require computationally prohibitive sizes. Conversely, smaller Masked Language Models (MLMs) like BERT and RoBERTa achieve state-of-the-art results through fine-tuning but struggle with extending to few-shot and zero-shot settings due to their architectural constraints. Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a set of finite statements and trains an encoder model to discriminate between the potential statements to determine the label. We do Statement-Tuning on multiple tasks to enable cross-task generalization. Experimental results demonstrate that Statement-Tuning achieves competitive performance compared to state-of-the-art LLMs with significantly fewer parameters. Moreover, the study investigates the impact of several design…
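
The statement formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the templates in `SENTIMENT_TEMPLATES`, the helper names, and the keyword-based `toy_score_true` function are all hypothetical stand-ins chosen for illustration. In practice the scorer would be a fine-tuned encoder (for example RoBERTa with a binary true/false head); here a toy scorer keeps the example runnable without model weights.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical templates: one natural-language statement per candidate label.
# The paper's actual templates are not shown in this excerpt.
SENTIMENT_TEMPLATES: Dict[str, str] = {
    "positive": 'The review "{text}" expresses a positive sentiment.',
    "negative": 'The review "{text}" expresses a negative sentiment.',
}


def build_statements(text: str, templates: Dict[str, str]) -> List[Tuple[str, str]]:
    """Turn one input into a finite set of (label, statement) candidates."""
    return [(label, tpl.format(text=text)) for label, tpl in templates.items()]


def make_training_pairs(
    text: str, gold_label: str, templates: Dict[str, str]
) -> List[Tuple[str, int]]:
    """Label each statement 1 (true) if it matches the gold label, else 0 (false)."""
    return [
        (statement, int(label == gold_label))
        for label, statement in build_statements(text, templates)
    ]


def predict(
    text: str, templates: Dict[str, str], score_true: Callable[[str], float]
) -> str:
    """Pick the label whose statement the scorer rates as most likely true."""
    candidates = build_statements(text, templates)
    best_label, _ = max(candidates, key=lambda pair: score_true(pair[1]))
    return best_label


def toy_score_true(statement: str) -> float:
    """Toy stand-in for a trained encoder: counts cue words that agree with the claim."""
    quoted, _, claim = statement.rpartition('" ')
    cues = {"positive": ("great", "love"), "negative": ("boring", "awful")}
    score = 0
    for polarity, words in cues.items():
        if polarity in claim:
            score += sum(word in quoted.lower() for word in words)
    return float(score)


if __name__ == "__main__":
    for pair in make_training_pairs("A great film", "positive", SENTIMENT_TEMPLATES):
        print(pair)
    print(predict("A boring, awful plot", SENTIMENT_TEMPLATES, toy_score_true))
```

Because every task is reduced to the same true/false decision over statements, a single binary classifier can in principle be trained on many tasks at once and then applied to an unseen task by writing new templates, which is the cross-task generalization the abstract refers to.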

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Smart Grid Security and Resilience

Methods: Attention Is All You Need · Sparse Evolutionary Training · Weight Decay · Dense Connections · Residual Connection · Softmax · Adam · Linear Warmup With Linear Decay · Layer Normalization