A Universal Discriminator for Zero-Shot Generalization

Haike Xu; Zongyu Lin; Jing Zhou; Yanan Zheng; Zhilin Yang

arXiv:2211.08099 · cs.CL · June 7, 2023

TL;DR

This paper demonstrates that discriminative models, trained as universal discriminators, outperform generative models in zero-shot NLP tasks, achieving state-of-the-art results with fewer parameters and minimal prompting.

Contribution

It introduces a simple discriminative approach trained as a universal discriminator that surpasses generative models in zero-shot and fine-tuning NLP tasks, with improved robustness and efficiency.

Findings

01 State-of-the-art zero-shot results on the T0 benchmark

02 Outperforms T0 by 16.0%, 7.8%, and 11.5%, respectively, on different scales

03 Achieves new SOTA with only 1/4 of the parameters of previous methods

Abstract

Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization. In this work, we challenge this convention by showing that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks. Technically, we train a single discriminator to predict whether a text sample comes from the true data distribution, similar to GANs. Since many NLP tasks can be formulated as selecting from a few options, we use this discriminator to predict the concatenation of input and which option has the highest probability of coming from the true data distribution. This simple formulation achieves state-of-the-art zero-shot results on the T0 benchmark, outperforming T0 by 16.0%, 7.8%, and 11.5% respectively on different scales. In the finetuning setting, our approach also achieves new state-of-the-art results on a wide…
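
The option-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `select_option`, `toy_score`, and `TRUE_SAMPLES` are hypothetical names, the concatenation with a single space is an assumption, and `toy_score` is a lookup-table stand-in for a trained discriminator that would return the probability that a text comes from the true data distribution.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for "the true data distribution": a tiny set of
# sentences that the toy discriminator treats as real.
TRUE_SAMPLES = {"the capital of france is paris."}


def toy_score(text: str) -> float:
    """Toy discriminator (assumption): high score if the text looks 'real'."""
    return 0.9 if text.lower() in TRUE_SAMPLES else 0.1


def select_option(
    input_text: str,
    options: Sequence[str],
    score_fn: Callable[[str], float],
) -> Tuple[int, List[float]]:
    """Score input+option concatenations and return the best option's index."""
    # Concatenate the input with each candidate option (separator is assumed).
    candidates = [f"{input_text} {option}" for option in options]
    # Ask the discriminator how likely each concatenation is to be real text.
    scores = [score_fn(candidate) for candidate in candidates]
    # Pick the option whose concatenation receives the highest score.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


best, scores = select_option(
    "The capital of France is", ["Berlin.", "Paris.", "Rome."], toy_score
)
print(best, scores)
```

In this sketch any callable that maps a string to a score can replace `toy_score`, so a real classifier could be dropped in without changing the selection logic.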

Code & Models

Repositories

rafa-zy/ud (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications