Zero-shot prompt-based classification: topic labeling in times of foundation models in German Tweets
Simon Münker, Kai Kugler, Achim Rettinger

TL;DR
This paper evaluates the effectiveness of zero-shot prompt-based classification for labeling German Tweets about European crises, demonstrating it performs comparably to fine-tuned models without requiring training data.
Contribution
It empirically assesses prompt-based classification in a real-world setting, highlighting its potential as a training-free alternative to traditional supervised methods.
Findings
The prompt-based approach is comparable in accuracy to a fine-tuned BERT model.
Prompt-based classification needs no annotated training data.
The results support a paradigm shift toward training-free approaches to NLP tasks.
Abstract
Filtering and annotating textual data are routine tasks in many areas, such as social media or news analytics. Automating these tasks allows analyses to scale in speed and in breadth of content covered, and decreases the manual effort required. Owing to technical advances in Natural Language Processing, specifically the success of large foundation models, a new tool has become available for automating such annotation processes: a text-to-text interface that takes written guidelines and requires no training samples. In this work, we assess these advances in the wild by empirically testing them on an annotation task using German Twitter data about social and political European crises. We compare the prompt-based results with our human annotation and with preceding classification approaches, including Naive Bayes and a BERT-based fine-tuning/domain adaptation pipeline. Our results show…
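To make the idea of a text-to-text interface driven by written guidelines concrete, here is a minimal Python sketch. It is not the authors' pipeline: the guideline text, the label set (`relevant` / `irrelevant`), the fallback label, and the helper names `build_prompt`, `parse_label`, and `classify` are all assumptions made for illustration, and `fake_generate` is a stand-in for whatever foundation model would actually be queried.

```python
from typing import Callable, Sequence

# Hypothetical annotation guidelines and label set (not taken from the paper).
GUIDELINES = (
    "You are annotating German tweets. Decide whether the tweet discusses "
    "a social or political crisis in Europe. Answer with exactly one label."
)
LABELS = ["relevant", "irrelevant"]
FALLBACK = "unknown"


def build_prompt(guidelines: str, labels: Sequence[str], tweet: str) -> str:
    """Assemble a zero-shot prompt: guidelines, allowed labels, one tweet, no examples."""
    return (
        f"{guidelines.strip()}\n\n"
        f"Allowed labels: {', '.join(labels)}\n"
        f"Tweet: {tweet.strip()}\n"
        "Label:"
    )


def parse_label(response: str, labels: Sequence[str], fallback: str) -> str:
    """Map free-text model output onto one allowed label, else return the fallback."""
    text = response.strip().lower()
    # Check longer labels first so a label that is a prefix of another cannot shadow it.
    for label in sorted(labels, key=len, reverse=True):
        if text.startswith(label.lower()):
            return label
    return fallback


def classify(
    tweet: str,
    generate: Callable[[str], str],
    guidelines: str = GUIDELINES,
    labels: Sequence[str] = LABELS,
    fallback: str = FALLBACK,
) -> str:
    """Label one tweet by prompting a text-to-text model; no training data involved."""
    return parse_label(generate(build_prompt(guidelines, labels, tweet)), labels, fallback)


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call: a keyword rule applied to the tweet line only."""
    tweet_line = [line for line in prompt.splitlines() if line.startswith("Tweet: ")][-1]
    return "Relevant." if "krise" in tweet_line.lower() else "irrelevant"
```

In practice `fake_generate` would be replaced by a call to an actual model, and the parsing step matters because a generative model may wrap its answer in extra text or punctuation. Returning a fallback label for unparseable output keeps such cases visible rather than silently assigning them to a class.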
Taxonomy
Topics: Natural Language Processing Techniques · Authorship Attribution and Profiling · Speech Recognition and Synthesis
Methods: WordPiece · Residual Connection · Weight Decay · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Softmax · Layer Normalization · Attention Dropout · Linear Warmup With Linear Decay · Dropout
