Classifying multilingual party manifestos: Domain transfer across country, time, and genre
Matthias Aßenmacher, Nadja Sauter, Christian Heumann

TL;DR
This study evaluates the effectiveness of domain transfer using transformer models for classifying multilingual political manifestos across different countries, genres, and time periods, highlighting the models' robustness and transferability.
Contribution
It demonstrates the strong within-domain performance of fine-tuned transformer models and assesses their robustness across various dimensions such as country, language, genre, and time.
Findings
BERT achieves the highest initial classification scores.
DistilBERT offers a competitive, computationally efficient alternative.
Models can be applied to future data with similar performance.
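The finding about future data implies a temporal hold-out evaluation: train on earlier documents, test on later ones, and compare the score with the within-domain score. A minimal sketch of that setup follows. The record layout (a dict with a `year` field), the cutoff year, and the choice of macro-averaged F1 as the metric are assumptions made for illustration and are not taken from the paper.

```python
def temporal_split(records, cutoff_year):
    """Split records into (train, test) by year.

    Records before cutoff_year go to train, the rest to test.
    The "year" key is a hypothetical field name used for illustration.
    """
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy, made-up records; the labels are placeholders, not real manifesto codes.
records = [
    {"text": "lower taxes", "label": "economy", "year": 2015},
    {"text": "more teachers", "label": "welfare", "year": 2016},
    {"text": "cut tariffs", "label": "economy", "year": 2019},
    {"text": "fund hospitals", "label": "welfare", "year": 2020},
]
train, test = temporal_split(records, cutoff_year=2018)
```

In practice a classifier would be fitted on `train` only, and `macro_f1` would be computed on its predictions for `test`. The same pattern applies to the other transfer dimensions by splitting on country, language, or genre instead of year.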
Abstract
Annotation costs for large corpora are still one of the main bottlenecks in empirical social science research. On the one hand, making use of the capabilities of domain transfer allows re-using annotated data sets and trained models. On the other hand, it is not clear how well domain transfer works and how reliable the results are for transfer across different dimensions. We explore the potential of domain transfer across geographical locations, languages, time, and genre in a large-scale database of political manifestos. First, we show the strong within-domain classification performance of fine-tuned transformer models. Second, we vary the genre of the test set across the aforementioned dimensions to test for the fine-tuned models' robustness and transferability. For switching genres, we use an external corpus of transcribed speeches from New Zealand politicians while for the other…
Taxonomy
Topics: Computational and Text Analysis Methods · Social Media and Politics · Linguistic Variation and Morphology
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Linear Layer · Dropout · WordPiece · Adam · Attention Dropout · Linear Warmup With Linear Decay
