Bias in Text Embedding Models

Vasyl Rakivnenko; Nestor Maslej; Jessica Cervi; Volodymyr Zhukov

arXiv:2406.12138 · cs.AI · June 19, 2024 · 1 citation

Open Access

TL;DR

This paper investigates gender bias in popular text embedding models. It finds that these models often associate certain professions with gendered terms and that the associations vary across models and prompts, so businesses that deploy embeddings should be aware of this bias.

Contribution

The paper provides an empirical analysis of gender bias in text embedding models, shows how that bias varies across models and professions, and stresses the importance of addressing it in practical use.

Findings

01 Models associate professions such as nurse and socialite with female terms

02 Models associate professions such as CEO and boss with male terms

03 The magnitude of the bias varies across models and prompts

Abstract

Text embedding is becoming an increasingly popular AI methodology, especially among businesses, yet the potential of text embedding models to be biased is not well understood. This paper examines the degree to which a selection of popular text embedding models are biased, particularly along gendered dimensions. More specifically, this paper studies the degree to which these models associate a list of given professions with gendered terms. The analysis reveals that text embedding models are prone to gendered biases but in varying ways. Although there are certain inter-model commonalities, for instance, greater association of professions like nurse, homemaker, and socialite with female identifiers, and greater association of professions like CEO, manager, and boss with male identifiers, not all models make the same gendered associations for each occupation. Furthermore, the magnitude and…
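The abstract describes measuring how strongly a model associates a profession with gendered terms. The excerpt does not state the exact metric, so the following is only a minimal sketch under assumptions: it scores a profession by the difference in cosine similarity to one female and one male identifier. The vectors in `TOY_EMBEDDINGS` are made up for illustration, and `toy_embed` and `gender_gap` are hypothetical names, not functions from the paper or from any library. In practice `toy_embed` would be replaced by a call to a real embedding model.

```python
import math

# Made-up 3-dimensional vectors standing in for real model embeddings.
# These numbers are illustrative only and are NOT results from the paper.
TOY_EMBEDDINGS = {
    "nurse": [0.2, 0.9, 0.1],
    "CEO": [0.9, 0.2, 0.1],
    "he": [1.0, 0.1, 0.0],
    "she": [0.1, 1.0, 0.0],
}


def toy_embed(text):
    """Hypothetical embedding function: looks up a toy vector by exact text."""
    return TOY_EMBEDDINGS[text]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_gap(profession, embed, male_term="he", female_term="she"):
    """Assumed bias score: similarity to the female term minus similarity
    to the male term. Positive means female-leaning, negative male-leaning."""
    p = embed(profession)
    return cosine(p, embed(female_term)) - cosine(p, embed(male_term))


if __name__ == "__main__":
    for profession in ("nurse", "CEO"):
        gap = gender_gap(profession, toy_embed)
        leaning = "female" if gap > 0 else "male"
        print(f"{profession}: gap={gap:+.3f} ({leaning}-leaning)")
```

Passing the embedding function as a parameter makes it possible to run the same score over several models and compare them, which mirrors the inter-model comparison the abstract describes. A fuller analysis would likely average over several gendered terms and prompts rather than rely on a single pair.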

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods