Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Ping Yang, Junjie Wang, Ruyi Gan, Xinyu Zhu, Lin Zhang, Ziwei Wu, Xinyu Gao, Jiaxing Zhang, Tetsuya Sakai

TL;DR
This paper introduces a format-agnostic zero-shot learning approach that converts tasks into multiple-choice problems, achieving state-of-the-art results with fewer parameters across various NLP tasks.
Contribution
It presents a novel multiple-choice conversion paradigm for zero-shot learning, enhancing generalization and reducing model size compared to existing large-scale models.
Findings
State-of-the-art performance on several benchmarks.
Effective on tasks like natural language inference and text classification.
Achieves high accuracy with only 235M parameters.
Abstract
We propose a new paradigm for zero-shot learners that is format agnostic, i.e., it is compatible with any format and applicable to a list of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis. Zero-shot learning aims to train a model on a given task such that it can address new learning tasks without any additional training. Our approach converts zero-shot learning into multiple-choice tasks, avoiding problems in commonly used large-scale generative models such as FLAN. It not only adds generalization ability to models but also significantly reduces the number of parameters. Our method shares the merits of efficient training and deployment. Our approach shows state-of-the-art performance on several benchmarks and produces satisfactory results on tasks such as natural language inference and text classification. Our model…
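To make the conversion idea concrete, here is a minimal sketch of recasting a labeled task as a multiple-choice problem. It is an illustration under stated assumptions, not the authors' implementation: the names `MultipleChoiceExample`, `classification_to_mc`, `nli_to_mc`, and `predict` are hypothetical, the question wording is invented for the example, and the `score` callable is a stand-in for whatever trained model would rate each option.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MultipleChoiceExample:
    """A task instance expressed as passage + question + candidate options."""
    passage: str
    question: str
    options: List[str]


def classification_to_mc(text: str, labels: List[str], question: str) -> MultipleChoiceExample:
    # Hypothetical helper: the class labels simply become the answer options.
    return MultipleChoiceExample(passage=text, question=question, options=list(labels))


def nli_to_mc(premise: str, hypothesis: str) -> MultipleChoiceExample:
    # Hypothetical helper: an NLI pair becomes a three-way choice.
    # The question wording and option names are assumptions for illustration.
    passage = f"Premise: {premise} Hypothesis: {hypothesis}"
    question = "How does the hypothesis relate to the premise?"
    return MultipleChoiceExample(passage, question, ["entailment", "neutral", "contradiction"])


def predict(example: MultipleChoiceExample,
            score: Callable[[str, str, str], float]) -> str:
    # Pick the option with the highest score. `score` is a placeholder for a
    # trained model; any callable (passage, question, option) -> float works.
    return max(example.options, key=lambda opt: score(example.passage, example.question, opt))


if __name__ == "__main__":
    example = classification_to_mc(
        "The movie was great.", ["good", "bad"], "What is the sentiment of the review?"
    )
    # Toy scorer that always prefers "good"; a real system would use a model here.
    toy_score = lambda passage, question, option: 1.0 if option == "good" else 0.0
    print(predict(example, toy_score))
```

Because every task ends up in the same passage, question, options shape, a single scoring model can in principle be reused across tasks without a task-specific output head, which is the format-agnostic property the abstract describes.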
Code & Models
- 🤗 IDEA-CCNL/Erlangshen-UniMC-Albert-235M-English (model) · 8 dl · ♡ 2
- 🤗 IDEA-CCNL/Erlangshen-UniMC-RoBERTa-110M-Chinese (model) · 5 dl · ♡ 6
- 🤗 IDEA-CCNL/Erlangshen-UniMC-RoBERTa-330M-Chinese (model) · 20 dl · ♡ 3
- 🤗 IDEA-CCNL/Erlangshen-UniMC-MegatronBERT-1.3B-Chinese (model) · 31 dl · ♡ 8
- 🤗 IDEA-CCNL/Randeng-BART-139M-QG-Chinese (model) · 3 dl · ♡ 6
- 🤗 IDEA-CCNL/Erlangshen-UniMC-DeBERTa-v2-110M-Chinese (model) · 29 dl · ♡ 3
- 🤗 IDEA-CCNL/Erlangshen-UniMC-DeBERTa-v2-330M-Chinese (model) · 14 dl · ♡ 2
- 🤗 IDEA-CCNL/Erlangshen-UniMC-DeBERTa-v2-1.4B-Chinese (model) · 2 dl · ♡ 4
- 🤗 IDEA-CCNL/Ziya-LLaMA-13B-v1 (model) · 1.0k dl · ♡ 276
- 🤗 IDEA-CCNL/Ziya-BLIP2-14B-Visual-v1 (model) · 17 dl · ♡ 58
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
