Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems
Svetlana Kiritchenko, Saif M. Mohammad

TL;DR
This paper introduces the Equity Evaluation Corpus (EEC), a set of 8,640 English sentences, and uses it to test 219 sentiment analysis systems for race and gender bias. Several of the systems show statistically significant bias.
Contribution
The paper presents the first benchmark dataset, EEC, for systematically evaluating racial and gender biases in sentiment analysis systems.
Findings
Several of the 219 systems examined show statistically significant bias with respect to race or gender.
The bias appears as consistently slightly higher sentiment intensity predictions for one race or one gender.
The EEC dataset is publicly available for future bias assessments.
Abstract
Automatic machine learning systems can inadvertently accentuate and perpetuate inappropriate human biases. Past work on examining inappropriate biases has largely focused on just individual systems. Further, there is no benchmark dataset for examining inappropriate biases in systems. Here for the first time, we present the Equity Evaluation Corpus (EEC), which consists of 8,640 English sentences carefully chosen to tease out biases towards certain races and genders. We use the dataset to examine 219 automatic sentiment analysis systems that took part in a recent shared task, SemEval-2018 Task 1 'Affect in Tweets'. We find that several of the systems show statistically significant bias; that is, they consistently provide slightly higher sentiment intensity predictions for one race or one gender. We make the EEC freely available.
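A minimal sketch of the kind of comparison the abstract describes: score sentences that differ only in the person mentioned, then look at the average gap in predicted sentiment intensity between the two groups. The template, names, emotion words, and `toy_score` function below are hypothetical stand-ins for illustration, not the actual EEC contents or any system from the shared task. A real evaluation would also need a statistical significance test over the paired differences.

```python
from statistics import mean

# Hypothetical template and word lists (assumptions, not taken from the EEC).
TEMPLATE = "{person} feels {emotion}."
GROUP_A = ["Alice", "Beth"]
GROUP_B = ["Adam", "Brian"]
EMOTIONS = ["angry", "happy"]


def build_pairs(template, group_a, group_b, emotions):
    """Return (sentence_a, sentence_b) pairs that differ only in the person."""
    pairs = []
    for emotion in emotions:
        for a, b in zip(group_a, group_b):
            pairs.append((
                template.format(person=a, emotion=emotion),
                template.format(person=b, emotion=emotion),
            ))
    return pairs


def mean_paired_difference(pairs, score):
    """Average of score(sentence_a) - score(sentence_b) over all pairs."""
    return mean(score(a) - score(b) for a, b in pairs)


def toy_score(sentence):
    """Stand-in scorer with a deliberate bias, used only to exercise the code."""
    base = 0.8 if "happy" in sentence else 0.2
    bump = 0.05 if sentence.startswith(("Alice", "Beth")) else 0.0
    return base + bump


pairs = build_pairs(TEMPLATE, GROUP_A, GROUP_B, EMOTIONS)
gap = mean_paired_difference(pairs, toy_score)
print(f"{len(pairs)} pairs, mean score gap (A - B): {gap:.3f}")
```

An unbiased scorer would give a gap close to zero. To examine a real system, replace `toy_score` with a function that returns that system's sentiment intensity prediction for a sentence.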
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts
