Crowdsourcing a Word-Emotion Association Lexicon

Saif M. Mohammad; Peter D. Turney

arXiv:1308.6297·cs.CL·August 30, 2013·68 cites

Open Access

TL;DR

This paper demonstrates how crowdsourcing can efficiently create large, high-quality emotion and polarity lexicons by addressing annotation challenges and optimizing question formulation for better agreement.

Contribution

It introduces a novel crowdsourcing approach for building emotion lexicons, including strategies to improve annotation quality and sense-level data collection.

Findings

01. Crowdsourcing effectively generates large emotion lexicons.

02. Question formulation impacts inter-annotator agreement.

03. Inclusion of word choice questions improves data quality.

Abstract

Even though considerable attention has been given to the polarity of words (positive and negative) and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper we show how the combined strength and wisdom of the crowds can be used to generate a large, high-quality, word-emotion and word-polarity association lexicon quickly and inexpensively. We enumerate the challenges in emotion annotation in a crowdsourcing scenario and propose solutions to address them. Most notably, in addition to questions about emotions associated with terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word…
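The role of the word choice question described above can be illustrated with a small sketch. It is not the authors' pipeline: the record layout (`term`, `choice`, `emotion`, `associated`), the gold answer table, the function names, and the example data are all hypothetical, and majority voting is an assumed aggregation step that the abstract does not specify.

```python
from collections import Counter, defaultdict

def filter_by_word_choice(annotations, gold_choice):
    """Keep only annotations whose word choice answer matches the gold answer.

    A wrong answer suggests random entry or unfamiliarity with the term,
    so the whole annotation is rejected (hypothetical rejection rule).
    """
    return [a for a in annotations if a["choice"] == gold_choice[a["term"]]]

def aggregate(annotations):
    """Majority vote per (term, emotion) pair (assumed aggregation scheme)."""
    votes = defaultdict(Counter)
    for a in annotations:
        votes[(a["term"], a["emotion"])][a["associated"]] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

# Made-up example data: four annotators judge whether "startle" evokes fear.
gold_choice = {"startle": "shake"}
annotations = [
    {"annotator": "a1", "term": "startle", "choice": "shake", "emotion": "fear", "associated": True},
    {"annotator": "a2", "term": "startle", "choice": "shake", "emotion": "fear", "associated": True},
    {"annotator": "a3", "term": "startle", "choice": "automobile", "emotion": "fear", "associated": False},
    {"annotator": "a4", "term": "startle", "choice": "shake", "emotion": "fear", "associated": False},
]

kept = filter_by_word_choice(annotations, gold_choice)  # drops a3's annotation
lexicon = aggregate(kept)
```

Because the gold answer is keyed to the intended meaning of the term, the same check could in principle tie each accepted annotation to a specific word sense, which is how the abstract describes obtaining sense-level annotations.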

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Spam and Phishing Detection · Complex Network Analysis Techniques