Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections
Łukasz Augustyniak, Krzysztof Rajda, Tomasz Kajdanowicz, Michał Bernaczyk

TL;DR
This paper introduces the first publicly available dataset of Polish political tweets, annotated for campaign categories, and demonstrates its use in training a neural tagger to analyze political advertising during the 2020 presidential elections.
Contribution
It provides a novel annotated dataset for Polish political ads and applies neural models for categorization, enabling new research in political social media analysis.
Findings
Achieved 0.65 inter-annotator agreement (Cohen's kappa)
Neural tagger reached 70% F1 score on the dataset
Initial analysis of Polish 2020 Presidential Elections on Twitter
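The agreement figure listed in the findings is Cohen's kappa, which corrects raw agreement between two annotators for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that calculation, not the authors' code. The function name and the placeholder labels "A" and "B" are assumptions made for illustration. The paper's nine categories are not listed here, and the authors annotate text chunks, so their exact unit of comparison may differ.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; one label per item per annotator.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # p_o: fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # p_e: chance agreement from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data with placeholder category names (not the paper's categories).
annotator_1 = ["A", "A", "B", "B"]
annotator_2 = ["A", "B", "B", "B"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))  # 0.5
```

In the toy data the annotators agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa is 0.5.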
Abstract
Political campaigns are full of political ads posted by candidates on social media. Political advertisements constitute a basic form of campaigning, subjected to various social requirements. We present the first publicly open dataset for detecting specific text chunks and categories of political advertising in the Polish language. It contains 1,705 human-annotated tweets tagged with nine categories, which constitute campaigning under Polish electoral law. We achieved a 0.65 inter-annotator agreement (Cohen's kappa score). An additional annotator resolved the mismatches between the first two annotators, improving the consistency and complexity of the annotation process. We used the newly created dataset to train a well-established neural tagger (achieving a 70% F1 score). We also present a possible direction of use cases for such datasets and models with an initial analysis…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Authorship Attribution and Profiling · Misinformation and Its Impacts
