Generating Sentiment Lexicons for German Twitter
Uladzimir Sidarenka, Manfred Stede

TL;DR
This paper compares sentiment lexicon generation methods for German Twitter data, showing that semi-automatic translations of English polarity lists outperform automatic methods and that dictionary-based approaches are more effective than corpus-based ones.
Contribution
It systematically evaluates and compares lexicon generation approaches for German social media, highlighting the effectiveness of semi-automatic translation and dictionary-based methods.
Findings
Semi-automatic translations outperform automatic SLG methods, reaching a macro-averaged F1-score of 0.589.
Dictionary-based techniques yield better polarity lists than corpus-based approaches, with F1-scores of up to 0.479 versus up to 0.419.
Abstract
Despite substantial progress made in developing new sentiment lexicon generation (SLG) methods for English, the task of transferring these approaches to other languages and domains in a sound way remains open. In this paper, we contribute to the solution of this problem by systematically comparing semi-automatic translations of common English polarity lists with the results of the original automatic SLG algorithms, which were applied directly to German data. We evaluate these lexicons on a corpus of 7,992 manually annotated tweets. In addition, we collate the results of dictionary- and corpus-based SLG methods in order to find out which of these paradigms is better suited for the inherently noisy domain of social media. Our experiments show that semi-automatic translations notably outperform automatic systems (reaching a macro-averaged F1-score of 0.589), and that…
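The scores quoted above are macro-averaged F1 values. As a rough illustration only, the sketch below shows how such a score could be computed when a polarity lexicon is used to label tokens. The helper names (`label_tokens`, `f1_for_label`, `macro_f1`), the three-class label set, and the fallback to "neutral" for unknown words are assumptions made for this example; the paper's actual evaluation procedure is not described in this excerpt and may differ.

```python
from typing import Dict, List, Sequence


def label_tokens(tokens: Sequence[str], lexicon: Dict[str, str]) -> List[str]:
    """Assign each token the polarity found in the lexicon.

    Assumption: tokens missing from the lexicon are treated as "neutral".
    """
    return [lexicon.get(token.lower(), "neutral") for token in tokens]


def f1_for_label(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """Compute the F1-score of a single class from gold and predicted labels."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[str], pred: Sequence[str], labels: Sequence[str]) -> float:
    """Average the per-class F1-scores, weighting every class equally."""
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)
```

Because every class contributes equally to the macro average, a lexicon that misses an entire polarity class is penalized heavily even if it labels the frequent neutral tokens correctly.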
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques · Topic Modeling
