Optimizing text representations to capture (dis)similarity between political parties
Tanise Ceron, Nico Blokker, Sebastian Padó

TL;DR
This paper investigates how to optimize text representations for modeling political party similarities, comparing informed approaches with heuristic methods, and finds that heuristics can reliably predict similarities without manual annotation.
Contribution
It introduces heuristic-based methods for text representation that effectively capture political party similarities without requiring manual annotations.
Findings
Heuristics based on within-party similarity outperform other methods.
Normalization steps improve the robustness of similarity predictions.
Manual annotations are not necessary for reliable party similarity modeling.
Abstract
Even though fine-tuned neural language models have been pivotal in enabling "deep" automatic text analysis, optimizing text representations for specific applications remains a crucial bottleneck. In this study, we look at this problem in the context of a task from computational social science, namely modeling pairwise similarities between political parties. Our research question is what level of structural information is necessary to create robust text representation, contrasting a strongly informed approach (which uses both claim span and claim category annotations) with approaches that forgo one or both types of annotation with document structure-based heuristics. Evaluating our models on the manifestos of German parties for the 2021 federal election, we find that heuristics that maximize within-party over between-party similarity along with a normalization step lead to reliable party…
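To make the idea of "within-party over between-party similarity with a normalization step" concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes sentence embeddings are already available as a NumPy array, uses per-dimension z-scoring as a stand-in normalization (the abstract does not say which normalization the paper uses), and the function names `zscore`, `mean_cosine`, `party_similarity_matrix`, and `within_over_between` are hypothetical. The toy data is synthetic, not manifesto text.

```python
import numpy as np


def zscore(embeddings):
    """Standardize each embedding dimension (an assumed normalization step)."""
    mean = embeddings.mean(axis=0, keepdims=True)
    std = embeddings.std(axis=0, keepdims=True)
    return (embeddings - mean) / np.where(std == 0, 1.0, std)


def mean_cosine(a, b):
    """Average cosine similarity over all sentence pairs drawn from a and b."""
    a = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), 1e-12)
    b = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), 1e-12)
    return float((a @ b.T).mean())


def party_similarity_matrix(embeddings, party_ids):
    """Return the sorted party labels and their pairwise similarity matrix."""
    parties = sorted(set(party_ids))
    ids = np.asarray(party_ids)
    normed = zscore(embeddings)
    sim = np.array(
        [[mean_cosine(normed[ids == p], normed[ids == q]) for q in parties]
         for p in parties]
    )
    return parties, sim


def within_over_between(sim):
    """Mean within-party similarity (diagonal) minus mean between-party similarity.

    The diagonal includes each sentence paired with itself, which slightly
    inflates the within-party term in this simplified version.
    """
    n = sim.shape[0]
    within = np.trace(sim) / n
    between = (sim.sum() - np.trace(sim)) / (n * (n - 1))
    return within - between


# Synthetic stand-in for sentence embeddings of two hypothetical parties.
rng = np.random.default_rng(0)
toy_embeddings = np.vstack(
    [rng.normal(3.0, 1.0, (20, 8)), rng.normal(-3.0, 1.0, (20, 8))]
)
toy_party_ids = ["A"] * 20 + ["B"] * 20

parties, sim = party_similarity_matrix(toy_embeddings, toy_party_ids)
print(parties)
print(np.round(sim, 3))
print(within_over_between(sim))
```

A larger gap between the diagonal and off-diagonal entries would suggest that the representation separates parties more cleanly, which is one way such a criterion could be used to compare embedding variants without manual annotation.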
Taxonomy
Topics: Computational and Text Analysis Methods
