The Undesirable Dependence on Frequency of Gender Bias Metrics Based on Word Embeddings
Francisco Valentini, Germán Rosati, Diego Fernandez Slezak, Edgar Altszyler

TL;DR
This paper investigates how word frequency influences gender bias metrics derived from word embeddings, showing that common metrics partly reflect individual word frequencies rather than word associations alone.
Contribution
The study demonstrates that popular bias metrics based on word embeddings are affected by word frequency, and shows that a PMI-based metric depends less on frequency.
Findings
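Skip-gram with negative sampling and GloVe tend to detect male bias in high-frequency words; GloVe tends to return female bias in low-frequency words
Frequency effects persist even when the words in the corpus are randomly shuffled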
PMI-based metric shows less dependence on word frequency
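To make the PMI-based finding concrete, here is a minimal sketch of one way a PMI-based bias score can be computed from co-occurrence counts. It is an illustration under stated assumptions, not the paper's implementation: the function names, the window size, the additive smoothing, and the toy corpus are all hypothetical. The score is PMI(word, A) - PMI(word, B); the marginal probability of the word cancels in that difference, leaving log P(word | A) / P(word | B).

```python
import math
from collections import Counter


def cooccurrence_counts(sentences, window=2):
    """Count ordered (word, context) pairs within a symmetric window."""
    pair_counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if i != j:
                    pair_counts[(word, tokens[j])] += 1
    return pair_counts


def pmi_bias(word, group_a, group_b, pair_counts, smoothing=0.5):
    """Return PMI(word, A) - PMI(word, B) = log P(word | A) / P(word | B).

    `smoothing` is a hypothetical additive constant that avoids log(0)
    when a word never co-occurs with one of the groups.
    """
    def cond_prob(group):
        # Co-occurrences of `word` with the group's context words ...
        joint = sum(pair_counts[(word, c)] for c in group)
        # ... divided by all co-occurrences with those context words.
        total = sum(n for (_, c), n in pair_counts.items() if c in group)
        return (joint + smoothing) / (total + smoothing)

    return math.log(cond_prob(group_a) / cond_prob(group_b))


# Toy corpus (made up for illustration only).
sentences = [
    ["she", "nurse"],
    ["she", "nurse"],
    ["he", "engineer"],
    ["he", "nurse"],
]
counts = cooccurrence_counts(sentences)
female, male = {"she"}, {"he"}
nurse_bias = pmi_bias("nurse", female, male, counts)
engineer_bias = pmi_bias("engineer", female, male, counts)
print(nurse_bias, engineer_bias)
```

In this sketch a positive score means the word co-occurs relatively more with group A (here the female context words) and a negative score means it co-occurs more with group B; the sign convention is an assumption of the example.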
Abstract
Numerous works use word embedding-based metrics to quantify societal biases and stereotypes in texts. Recent studies have found that word embeddings can capture semantic similarity but may be affected by word frequency. In this work we study the effect of frequency when measuring female vs. male gender bias with word embedding-based bias quantification methods. We find that Skip-gram with negative sampling and GloVe tend to detect male bias in high frequency words, while GloVe tends to return female bias in low frequency words. We show these behaviors still exist when words are randomly shuffled. This proves that the frequency-based effect observed in unshuffled corpora stems from properties of the metric rather than from word associations. The effect is spurious and problematic since bias metrics should depend exclusively on word co-occurrences and not individual word frequencies.…
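For readers unfamiliar with embedding-based bias metrics, the following is a minimal sketch of one common formulation: the mean cosine similarity of a target word to a set of female words minus its mean cosine similarity to a set of male words. The exact metric used in the paper may differ, and the function names, sign convention (positive means female), and two-dimensional toy vectors are assumptions made for illustration; real vectors would come from a trained Skip-gram or GloVe model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embedding_bias(word, female_words, male_words, vectors):
    """Mean similarity to female words minus mean similarity to male words."""
    to_female = [cosine(vectors[word], vectors[f]) for f in female_words]
    to_male = [cosine(vectors[word], vectors[m]) for m in male_words]
    return sum(to_female) / len(to_female) - sum(to_male) / len(to_male)


# Hypothetical toy vectors, not taken from any trained model.
toy_vectors = {
    "she": [1.0, 0.0],
    "he": [0.0, 1.0],
    "nurse": [0.9, 0.1],
    "engineer": [0.2, 0.8],
}
nurse_score = embedding_bias("nurse", ["she"], ["he"], toy_vectors)
engineer_score = embedding_bias("engineer", ["she"], ["he"], toy_vectors)
print(nurse_score, engineer_score)
```

A shuffling check of the kind the abstract describes would recompute such scores on embeddings trained on a corpus whose words have been randomly permuted, where no genuine word associations remain; training the embeddings is outside the scope of this sketch.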
Taxonomy
Topics: Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Media Influence and Politics
Methods: GloVe Embeddings
