Efficient Few-Shot Learning Without Prompts
Lewis Tunstall, Nils Reimers, Unso Eun Seo Jo, Luke Bates, Daniel Korat, Moshe Wasserblat, Oren Pereg

TL;DR
SetFit is a prompt-free, efficient method for few-shot learning that fine-tunes Sentence Transformers. It achieves high accuracy with less training time and fewer parameters, and it extends to multilingual applications.
Contribution
The paper presents SetFit, a novel contrastive Siamese fine-tuning approach that eliminates prompts and verbalizers, reducing complexity and training time in few-shot learning.
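A minimal sketch of what a contrastive Siamese objective can look like, assuming a cosine-similarity loss (the text above does not name the loss, so treat this choice as an assumption). Both texts in a pair are encoded by the same model, and the loss pushes the cosine similarity of the two embeddings toward 1.0 for same-class pairs and 0.0 for different-class pairs. The function names `cosine_similarity` and `pair_loss` are hypothetical, and the embeddings are plain lists of floats standing in for Sentence Transformer outputs.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_loss(u, v, target):
    """Squared error between the pair's cosine similarity and its target.

    target is 1.0 for a same-class pair and 0.0 for a different-class pair
    (an assumed labeling convention for this sketch).
    """
    return (cosine_similarity(u, v) - target) ** 2
```

With this loss, a same-class pair whose embeddings already point in the same direction contributes nothing, while a same-class pair with orthogonal embeddings contributes the maximum penalty for that target.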
Findings
SetFit achieves accuracy comparable to PEFT and PET methods.
SetFit is an order of magnitude faster to train than PEFT and PET techniques.
SetFit performs well in multilingual settings by switching the underlying Sentence Transformer.
Abstract
Recent few-shot methods, such as parameter-efficient fine-tuning (PEFT) and pattern exploiting training (PET), have achieved impressive results in label-scarce settings. However, they are difficult to employ since they are subject to high variability from manually crafted prompts, and typically require billion-parameter language models to achieve high accuracy. To address these shortcomings, we propose SetFit (Sentence Transformer Fine-tuning), an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers (ST). SetFit works by first fine-tuning a pretrained ST on a small number of text pairs, in a contrastive Siamese manner. The resulting model is then used to generate rich text embeddings, which are used to train a classification head. This simple framework requires no prompts or verbalizers, and achieves high accuracy with orders of magnitude less parameters…
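The first stage described in the abstract, building text pairs from a handful of labeled examples, might be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it enumerates every pair of examples instead of sampling them, and the helper name `make_contrastive_pairs` and the toy sentiment data are hypothetical.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Turn a few labeled texts into (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Assumption: all pairs are enumerated; a sampling scheme could be
    substituted when the number of examples grows.
    """
    pairs = []
    for (t1, y1), (t2, y2) in combinations(zip(texts, labels), 2):
        pairs.append((t1, t2, 1.0 if y1 == y2 else 0.0))
    return pairs


# Toy few-shot set: two examples per class (hypothetical data).
texts = ["great movie", "loved it", "terrible plot", "awful acting"]
labels = ["pos", "pos", "neg", "neg"]
pairs = make_contrastive_pairs(texts, labels)
```

Four labeled examples already yield six training pairs, which is why pairing helps in label-scarce settings. After fine-tuning on such pairs, the second stage would encode each original text with the tuned model and fit a classification head on the resulting embeddings.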
Code & Models
- 🤗 amazon/sm-hackathon-actionability-9-multi-outputs-setfit-all-roberta-large-model-v0.1 · model · 2 dl · ♡ 3
- 🤗 uaritm/lik_neuro_202 · model · 14 dl · ♡ 1
- 🤗 philschmid/setfit-ag-news-endpoint · model · 10 dl · ♡ 8
- 🤗 lewtun/setfit-new-model-card · model · 2 dl
- 🤗 lewispons/Email-classifier-v2 · model · 4 dl · ♡ 2
- 🤗 fathyshalab/clinic-credit_cards-roberta · model · 1 dl
- 🤗 fathyshalab/massive-roberta · model · 1 dl
- 🤗 gayatrividhate/sentiment_analysis_SetFit · model · 3 dl
- 🤗 fathyshalab/clinic-kitchen_and_dining-roberta-domain-adaptation · model · 2 dl
- 🤗 fathyshalab/clinic-kitchen_and_dining-roberta · model · 1 dl
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Softmax · Dropout · Label Smoothing
