A Corpus for Sentence-level Subjectivity Detection on English News Articles
Francesco Antici, Andrea Galassi, Federico Ruggeri, Katerina Korre, Arianna Muti, Alessandra Bardi, Alice Fedotova, Alberto Barrón-Cedeño

TL;DR
This paper introduces a new corpus for sentence-level subjectivity detection in English news articles, along with annotation guidelines that are language-agnostic, and evaluates multilingual transformer models on the task.
Contribution
It presents a novel, language-independent annotation scheme and a new corpus, enabling subjectivity detection without language-specific tools, and assesses multilingual models' effectiveness.
Findings
Models trained in the multilingual setting achieve the best performance on the task, ahead of the mono- and cross-language settings.
The corpus supports subjectivity detection in English and across other languages without language-specific tools such as lexicons or machine translation.
Abstract
We develop novel annotation guidelines for sentence-level subjectivity detection, which are not limited to language-specific cues. We use our guidelines to collect NewsSD-ENG, a corpus of 638 objective and 411 subjective sentences extracted from English news articles on controversial topics. Our corpus paves the way for subjectivity detection in English and across other languages without relying on language-specific tools, such as lexicons or machine translation. We evaluate state-of-the-art multilingual transformer-based models on the task in mono-, multi-, and cross-language settings. For this purpose, we re-annotate an existing Italian corpus. We observe that models trained in the multilingual setting achieve the best performance on the task.
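The three evaluation settings named in the abstract can be sketched as train/test language pairings. This is a minimal illustration, not the authors' experimental code: the language codes, the function name `build_settings`, and the exact reading of each setting (mono = train and test on the same language, multi = train on both languages, cross = train on one language and test on the other) are assumptions.

```python
# Hypothetical language codes: "en" for NewsSD-ENG, "it" for the
# re-annotated Italian corpus mentioned in the abstract.
LANGUAGES = ("en", "it")


def build_settings(languages=LANGUAGES):
    """Return {setting: [(train_langs, test_lang), ...]}.

    Assumed definitions (not confirmed by the abstract):
      mono  - train and test on the same language
      multi - train on all languages, test on each language
      cross - train on one language, test on a different one
    """
    settings = {"mono": [], "multi": [], "cross": []}
    for test_lang in languages:
        settings["mono"].append(((test_lang,), test_lang))
        settings["multi"].append((tuple(languages), test_lang))
        for train_lang in languages:
            if train_lang != test_lang:
                settings["cross"].append(((train_lang,), test_lang))
    return settings


if __name__ == "__main__":
    for name, pairs in build_settings().items():
        for train_langs, test_lang in pairs:
            print(f"{name:5s} train={'+'.join(train_langs)} test={test_lang}")
```

Each pairing would then drive one fine-tuning run of a multilingual transformer, with the training set drawn from `train_langs` and the evaluation set from `test_lang`.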
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques
