Manual Verbalizer Enrichment for Few-Shot Text Classification

Quang Anh Nguyen; Nadi Tomeh; Mustapha Lebbah; Thierry Charnois; Hanene Azzag; Santiago Cordoba Muñoz

arXiv:2410.06173 · cs.CL · October 15, 2024

TL;DR

This paper introduces MAVE, a method for enriching verbalizers in prompt-based few-shot text classification by leveraging word embeddings, achieving state-of-the-art results with minimal supervision.

Contribution

Proposes MAVE, a novel verbalizer enrichment technique using embedding neighborhood relations, and establishes a benchmarking procedure for few-shot document classification.

Findings

01. MAVE outperforms existing verbalizers in few-shot settings.

02. The approach is especially effective with very limited data.

03. Achieves state-of-the-art results with fewer resources.

Abstract

With the continuous development of pre-trained language models, prompt-based training has become a widely adopted paradigm that substantially improves how models are exploited for many natural language processing tasks. Prompting also performs well compared to traditional fine-tuning in zero-shot and few-shot scenarios, where the amount of annotated data is limited. In this framework, verbalizers play an essential role: they map masked-word distributions to output predictions. In this work, we propose MAVE, an approach to verbalizer construction that enriches class labels using neighborhood relations in the word embedding space for the text classification task. In addition, we develop a benchmarking procedure to evaluate typical verbalizer baselines for document classification in few-shot learning contexts. Our model achieves…
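The general idea of enriching a verbalizer through embedding neighborhoods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how MAVE selects, weights, or trains over neighbor words, so the sketch assumes cosine-similarity nearest neighbors and a plain average of masked-word probabilities. The function names `enrich_verbalizer` and `classify`, the toy vocabulary, the two-dimensional embeddings, and the probability values are all hypothetical.

```python
import numpy as np


def enrich_verbalizer(seed_words, vocab, embeddings, k=2):
    """Extend each class's seed word with its k nearest neighbors (cosine similarity).

    Assumption: nearest-neighbor selection is an illustrative stand-in for
    the paper's enrichment step, whose details are not given in the abstract.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    index = {word: i for i, word in enumerate(vocab)}
    enriched = {}
    for label, seed in seed_words.items():
        sims = unit @ unit[index[seed]]      # cosine similarity to the seed word
        order = np.argsort(-sims)            # most similar first
        neighbors = [vocab[i] for i in order if vocab[i] != seed][:k]
        enriched[label] = [seed] + neighbors
    return enriched


def classify(mask_probs, enriched, vocab):
    """Score each class by the mean masked-word probability of its label words."""
    index = {word: i for i, word in enumerate(vocab)}
    scores = {
        label: float(np.mean([mask_probs[index[w]] for w in words]))
        for label, words in enriched.items()
    }
    return max(scores, key=scores.get), scores


# Toy vocabulary and hand-made 2-D embeddings (hypothetical values).
vocab = ["sports", "football", "tennis", "politics", "election", "senate"]
embeddings = np.array([
    [1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
    [0.0, 1.0], [0.1, 0.9], [0.2, 0.8],
])
seed_words = {"SPORTS": "sports", "POLITICS": "politics"}

enriched = enrich_verbalizer(seed_words, vocab, embeddings, k=2)

# Hypothetical masked-word distribution produced by a language model for one
# prompt. The two seed words receive the same probability, so a seed-only
# verbalizer would tie; the neighbor words break the tie.
mask_probs = np.array([0.05, 0.40, 0.10, 0.05, 0.20, 0.20])
label, scores = classify(mask_probs, enriched, vocab)
print(enriched)
print(label, scores)
```

In this toy setup the class names alone carry too little probability mass to separate the classes, which is the situation an enriched verbalizer is meant to help with; with real models, the embeddings and masked-word probabilities would come from the pre-trained language model rather than hand-written arrays.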

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis