Zero-shot prompt-based classification: topic labeling in times of foundation models in German Tweets

Simon Münker; Kai Kugler; Achim Rettinger

arXiv:2406.18239·cs.CL·June 27, 2024·1 cites

TL;DR

This paper evaluates the effectiveness of zero-shot prompt-based classification for labeling German Tweets about European crises, demonstrating it performs comparably to fine-tuned models without requiring training data.

Contribution

It empirically assesses prompt-based classification in a real-world setting, highlighting its potential as a training-free alternative to traditional supervised methods.

Findings

01

Prompt-based approach is comparable to fine-tuned BERT in accuracy.

02

No annotated training data needed for prompt-based classification.

03

Supports paradigm shift towards training-free NLP tasks.

Abstract

Filtering and annotating textual data are routine tasks in many areas, such as social media or news analytics. Automating these tasks allows analyses to scale in speed and breadth of content covered and decreases the manual effort required. Technical advancements in Natural Language Processing, specifically the success of large foundation models, have made a new tool available for automating such annotation processes: a text-to-text interface that takes written guidelines and needs no training samples. In this work, we assess these advancements in the wild by empirically testing them on an annotation task on German Twitter data about social and political European crises. We compare the prompt-based results with our human annotation and with preceding classification approaches, including Naive Bayes and a BERT-based fine-tuning/domain adaptation pipeline. Our results show…
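
As a rough illustration of the workflow the abstract describes (written guidelines in, a label out, no training samples), the following minimal Python sketch builds a zero-shot prompt, maps the model's free-text answer to a label, and scores the predictions against human annotations. The guideline text, the label names, and the `generate` callable are all hypothetical placeholders, not the paper's actual guidelines, label scheme, or model interface.

```python
from typing import Callable, List, Sequence

# Hypothetical guideline text and label set (assumptions for illustration only).
GUIDELINES = (
    "Decide whether the following German tweet is about a social or "
    "political European crisis. Answer with exactly one word: "
    "'relevant' or 'irrelevant'."
)
LABELS = ("relevant", "irrelevant")


def build_prompt(tweet: str, guidelines: str = GUIDELINES) -> str:
    """Combine written guidelines and one tweet into a prompt without examples."""
    return f"{guidelines}\n\nTweet: {tweet}\nAnswer:"


def parse_label(
    response: str,
    labels: Sequence[str] = LABELS,
    fallback: str = "irrelevant",
) -> str:
    """Map free-text model output to a label; use the fallback if none is found."""
    text = response.strip().lower()
    # Check longer labels first so 'irrelevant' is not matched as 'relevant'.
    for label in sorted(labels, key=len, reverse=True):
        if label in text:
            return label
    return fallback


def classify(tweets: Sequence[str], generate: Callable[[str], str]) -> List[str]:
    """Label each tweet by prompting a text-to-text model (passed in as `generate`)."""
    return [parse_label(generate(build_prompt(tweet))) for tweet in tweets]


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Share of predictions that match the human annotation."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

Here `generate` stands for any function that sends a prompt string to a foundation model and returns its text reply. Passing it in as a parameter keeps the sketch independent of a particular model or API.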

Taxonomy

Topics: Natural Language Processing Techniques · Authorship Attribution and Profiling · Speech Recognition and Synthesis

Methods: WordPiece · Residual Connection · Weight Decay · Softmax · Layer Normalization · Attention Dropout · Linear Warmup With Linear Decay · Dropout