Efficient Few-Shot Learning Without Prompts

Lewis Tunstall; Nils Reimers; Unso Eun Seo Jo; Luke Bates; Daniel Korat; Moshe Wasserblat; Oren Pereg

arXiv:2209.11055 · cs.CL · September 23, 2022 · 101 cites

Open Access · 1 Repo · 10 Models · 1 Dataset

TL;DR

SetFit is a prompt-free, efficient method for few-shot learning built on Sentence Transformers. It reaches high accuracy with less training time and fewer parameters, and it extends to multilingual applications.

Contribution

The paper presents SetFit, a novel contrastive Siamese fine-tuning approach that eliminates prompts and verbalizers, reducing complexity and training time in few-shot learning.
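
As a rough illustration of the contrastive Siamese step, the sketch below builds labeled text pairs from a handful of few-shot examples: two texts with the same class form a positive pair (target 1.0), and two texts with different classes form a negative pair (target 0.0). The function name `build_contrastive_pairs`, its parameters, and the example sentences are hypothetical and chosen for illustration; the sampling strategy in huggingface/setfit may differ.

```python
import random
from itertools import combinations


def build_contrastive_pairs(texts, labels, max_pairs_per_kind=None, seed=0):
    """Build (text_a, text_b, target) triples from few-shot labeled texts.

    Hypothetical helper: target is 1.0 when both texts share a label
    and 0.0 otherwise. This is an assumed pairing scheme, not the
    reference implementation.
    """
    rng = random.Random(seed)
    positives, negatives = [], []
    # Enumerate every unordered pair of examples exactly once.
    for i, j in combinations(range(len(texts)), 2):
        if labels[i] == labels[j]:
            positives.append((texts[i], texts[j], 1.0))
        else:
            negatives.append((texts[i], texts[j], 0.0))
    rng.shuffle(positives)
    rng.shuffle(negatives)
    # Optionally cap each kind so neither dominates the pair set.
    if max_pairs_per_kind is not None:
        positives = positives[:max_pairs_per_kind]
        negatives = negatives[:max_pairs_per_kind]
    return positives + negatives


# Toy few-shot set: two examples per class (made-up sentences).
texts = [
    "I loved this film",
    "A wonderful experience",
    "Terrible and boring",
    "I want my money back",
]
labels = [1, 1, 0, 0]

pairs = build_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Pairs like these would then be passed to a Siamese fine-tuning loop that pulls the embeddings of positive pairs together and pushes those of negative pairs apart. Because the number of pairs grows quadratically with the number of labeled examples, even a few examples per class yield a usable training set.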

Findings

01. SetFit achieves accuracy comparable to PEFT and PET methods.

02. SetFit trains an order of magnitude faster than PEFT and PET techniques.

03. SetFit performs well in multilingual settings by switching the underlying Sentence Transformer.

Abstract

Recent few-shot methods, such as parameter-efficient fine-tuning (PEFT) and pattern-exploiting training (PET), have achieved impressive results in label-scarce settings. However, they are difficult to employ since they are subject to high variability from manually crafted prompts, and typically require billion-parameter language models to achieve high accuracy. To address these shortcomings, we propose SetFit (Sentence Transformer Fine-tuning), an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers (ST). SetFit works by first fine-tuning a pretrained ST on a small number of text pairs, in a contrastive Siamese manner. The resulting model is then used to generate rich text embeddings, which are used to train a classification head. This simple framework requires no prompts or verbalizers, and achieves high accuracy with orders of magnitude fewer parameters…
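
The second stage described in the abstract, training a classification head on embeddings from the fine-tuned model, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the 2-D vectors are hand-made stand-ins for real sentence embeddings, the head is a plain logistic regression (the abstract only says "a classification head"), and the names `train_logistic_head` and `predict` are hypothetical.

```python
import math


def train_logistic_head(embeddings, labels, lr=0.5, epochs=200):
    """Fit a binary logistic-regression head on fixed embeddings with SGD.

    Hypothetical helper: the encoder is treated as frozen, so only the
    head's weights and bias are updated.
    """
    dim = len(embeddings[0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability of class 1
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the head scores x on the positive side, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0


# Stand-in embeddings (assumption): in practice these would come from
# encoding each training text with the fine-tuned Sentence Transformer.
toy_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
toy_labels = [1, 1, 0, 0]

weights, bias = train_logistic_head(toy_embeddings, toy_labels)
predictions = [predict(weights, bias, x) for x in toy_embeddings]
print(predictions)
```

Keeping the head separate from the encoder means the second stage is cheap: once embeddings are computed, fitting a small linear model takes a negligible fraction of the total training time.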

Code & Models

Repositories

huggingface/setfit
PyTorch · Official

Datasets

kardosdrur/dawiki_categories
dataset · 24 dl

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Softmax · Dropout · Label Smoothing