Wide range screening of algorithmic bias in word embedding models using large sentiment lexicons reveals underreported bias types

David Rozado

arXiv:1905.11985 · cs.CY · July 1, 2020

TL;DR

This study conducts a large-scale analysis of sentiment biases in word embeddings across various social dimensions, revealing underreported bias types and highlighting the complexity and heterogeneity of algorithmic bias.

Contribution

It introduces a comprehensive screening method using large sentiment lexicons to identify diverse and underreported biases in popular word embedding models.
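
As a rough illustration of lexicon-based screening, the sketch below scores a term by its cosine similarity to a sentiment axis built from positive and negative lexicon words. This is one common way to operationalize sentiment association in embeddings and is not necessarily the paper's exact procedure. The function names, the toy 2-dimensional vectors, and the placeholder terms `term_a` and `term_b` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sentiment_score(word, embeddings, positive_words, negative_words):
    """Project a word onto a (hypothetical) sentiment axis.

    The axis is the difference between the mean positive-lexicon vector
    and the mean negative-lexicon vector. Lexicon words missing from the
    embedding vocabulary are skipped.
    """
    pos = mean_vector([embeddings[w] for w in positive_words if w in embeddings])
    neg = mean_vector([embeddings[w] for w in negative_words if w in embeddings])
    axis = [p - q for p, q in zip(pos, neg)]
    return cosine(embeddings[word], axis)


# Toy embeddings (made up for illustration, not taken from any real model).
toy_embeddings = {
    "good": [1.0, 0.1],
    "great": [0.9, 0.2],
    "bad": [-1.0, 0.1],
    "awful": [-0.9, 0.2],
    "term_a": [0.5, 1.0],
    "term_b": [-0.5, 1.0],
}

positive_lexicon = ["good", "great"]
negative_lexicon = ["bad", "awful"]

score_a = sentiment_score("term_a", toy_embeddings, positive_lexicon, negative_lexicon)
score_b = sentiment_score("term_b", toy_embeddings, positive_lexicon, negative_lexicon)

# A positive score means the term sits closer to the positive pole.
print(f"term_a: {score_a:+.3f}, term_b: {score_b:+.3f}")
```

In a real screening run, the toy dictionary would be replaced by a pretrained embedding model and the two short word lists by a large sentiment lexicon; comparing scores across groups of terms (for example, sets of given names) is what surfaces a systematic skew.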

Findings

01 Systemic bias against given names popular among African-Americans in most models examined

02 Gender bias in embeddings is multifaceted and often reversed in polarity relative to prior reports

03 Previously unreported biases along socioeconomic status, age, appearance, religion, and political leaning

Abstract

This work describes a large-scale analysis of sentiment associations in popular word embedding models along the lines of gender and ethnicity but also along the less frequently studied dimensions of socioeconomic status, age, sexual orientation, religious sentiment and political leanings. Consistent with previous scholarly literature, this work has found systemic bias against given names popular among African-Americans in most embedding models examined. Gender bias in embedding models, however, appears to be multifaceted and often reversed in polarity relative to what has been regularly reported. Interestingly, using the common operationalization of the term bias in the fairness literature, previously unreported bias types in word embedding models have also been identified. Specifically, the popular embedding models analyzed here display negative biases against middle and working-class…
