Hotter and Colder: A New Approach to Annotating Sentiment, Emotions, and Bias in Icelandic Blog Comments
Steinunn Rut Friðriksdóttir, Dan Saattrup Nielsen, and Hafsteinn Einarsson

TL;DR
Hotter and Colder introduces a comprehensive Icelandic online comment dataset annotated for sentiment, emotions, and bias, combining automated and manual labeling to support research in content moderation and harmful behavior detection.
Contribution
The paper presents a new large-scale annotated dataset for Icelandic online comments, integrating GPT-4o mini automation with manual review to improve annotation quality.
Findings
Produced 12,232 manually reviewed comments, with 19,301 annotations in total
Automatically annotated approximately 800,000 comments across 25 tasks using GPT-4o mini
Provides a resource for content moderation research in Icelandic
Abstract
This paper presents Hotter and Colder, a dataset designed to analyze various types of online behavior in Icelandic blog comments. Building on previous work, we used GPT-4o mini to annotate approximately 800,000 comments for 25 tasks, including sentiment analysis, emotion detection, hate speech, and group generalizations. Each comment was automatically labeled on a 5-point Likert scale. In a second annotation stage, comments with high or low probabilities of containing each examined behavior were subjected to manual revision. By leveraging crowdworkers to refine these automatically labeled comments, we ensure the quality and accuracy of our dataset, resulting in 12,232 uniquely annotated comments and 19,301 annotations. Hotter and Colder provides an essential resource for advancing research in content moderation and automatically detecting harmful online behaviors in Icelandic.
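The two-stage process described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `AutoLabel` record, the `select_for_review` helper, the task names, and the thresholds are all hypothetical, and treating the extreme ends of the 5-point Likert scale as a stand-in for "high or low probability" is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AutoLabel:
    """One automatic label: a comment scored for one task (hypothetical schema)."""
    comment_id: str
    task: str
    score: int  # 5-point Likert score, 1 (lowest) to 5 (highest)


def select_for_review(labels, low=1, high=5):
    """Keep labels whose score sits at either extreme of the scale.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    return [lab for lab in labels if lab.score <= low or lab.score >= high]


# Toy data: the same comment can be labeled for several tasks.
labels = [
    AutoLabel("c1", "hate_speech", 5),
    AutoLabel("c2", "hate_speech", 3),
    AutoLabel("c3", "sentiment", 1),
    AutoLabel("c1", "sentiment", 4),
]

review_queue = select_for_review(labels)
unique_comments = {lab.comment_id for lab in review_queue}
```

In this toy setup the review queue is counted per (comment, task) pair while unique comments are counted per comment, which is one way a dataset can end up with more annotations than annotated comments.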
Taxonomy
Topics: Sentiment Analysis and Opinion Mining
