
TL;DR
This thesis advances German Twitter sentiment analysis by creating a new annotated corpus and developing novel methods that outperform existing techniques in lexicon generation, fine-grained opinion mining, message polarity classification, and discourse-aware analysis.
Contribution
It introduces a new German Twitter sentiment corpus and proposes innovative methods that surpass existing techniques in multiple sentiment analysis tasks.
Findings
Dictionary-based lexicon generation methods generally yield better lexicons than corpus- and word-embedding-based ones.
The linear projection algorithm improves lexicon quality.
CRF models outperform RNNs in source and target prediction.
Abstract
This thesis explores the ways in which people express their opinions on German Twitter, examines current approaches to the automatic mining of these opinions, and proposes novel methods that outperform state-of-the-art techniques. For this purpose, I introduce a new corpus of German tweets that have been manually annotated with sentiments, their targets and holders, as well as polar terms and their contextual modifiers. Using these data, I explore four major areas of sentiment research: (i) generation of sentiment lexicons, (ii) fine-grained opinion mining, (iii) message-level polarity classification, and (iv) discourse-aware sentiment analysis. In the first task, I compare three popular groups of lexicon generation methods (dictionary-, corpus-, and word-embedding-based ones), finding that dictionary-based systems generally yield better lexicons than the last two groups. Apart from this, I…
Taxonomy
Methods: Conditional Random Field
