Assessing the quality and coherence of word embeddings after SCM-based intersectional bias mitigation

Eren Kocadag; Seyed Sahand Mohammadi Ziabari; Ali Mohammed Mansoor Alsahag

arXiv:2601.04393 · cs.AI · January 9, 2026

Open Access

TL;DR

This paper evaluates SCM-based mitigation of intersectional bias in static word embeddings, measuring its effect on neighborhood coherence and analogy behavior across Word2Vec, GloVe, and ConceptNet Numberbatch.

Contribution

It extends SCM-based bias mitigation from single groups to intersectional identities and compares three debiasing strategies (Subtraction, Linear Projection, and Partial Projection) across three embedding families.

Findings

01. SCM-based mitigation effectively reduces intersectional bias.

02. A trade-off is observed between neighborhood coherence and analogy performance.

03. Partial Projection offers a conservative and stable debiasing approach.
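
To make the idea of neighborhood coherence concrete, here is a minimal sketch of one way to compare a word's nearest neighbors before and after debiasing. The Jaccard overlap of top-k cosine neighbors is an assumption chosen for illustration, not necessarily the measure used in the paper, and the function names `top_k_neighbors` and `neighborhood_overlap` are hypothetical.

```python
import numpy as np

def top_k_neighbors(matrix, index, k):
    """Return the indices of the k rows most cosine-similar to row `index`."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    normed = matrix / np.clip(norms, 1e-12, None)  # guard against zero rows
    sims = normed @ normed[index]
    sims[index] = -np.inf  # exclude the word itself
    return set(np.argsort(-sims)[:k].tolist())

def neighborhood_overlap(before, after, index, k=5):
    """Jaccard overlap of top-k neighbor sets (illustrative assumption)."""
    a = top_k_neighbors(before, index, k)
    b = top_k_neighbors(after, index, k)
    return len(a & b) / len(a | b)

# Toy data standing in for real embeddings (rows are word vectors).
rng = np.random.default_rng(0)
original = rng.normal(size=(20, 6))
perturbed = original + 0.1 * rng.normal(size=original.shape)
score = neighborhood_overlap(original, perturbed, index=0, k=5)
```

A score of 1.0 means the neighbor set is unchanged; lower values indicate that debiasing moved the word relative to its neighbors.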

Abstract

Static word embeddings often absorb social biases from the text they learn from, and those biases can quietly shape downstream systems. Prior work that uses the Stereotype Content Model (SCM) has focused mostly on single-group bias along warmth and competence. We broaden that lens to intersectional bias by building compound representations for pairs of social identities through summation or concatenation, and by applying three debiasing strategies: Subtraction, Linear Projection, and Partial Projection. We study three widely used embedding families (Word2Vec, GloVe, and ConceptNet Numberbatch) and assess them with two complementary views of utility: whether local neighborhoods remain coherent and whether analogy behavior is preserved. Across models, SCM-based mitigation carries over well to the intersectional case and largely keeps the overall semantic landscape intact. The main cost is…
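
The operations named in the abstract can be sketched in a few lines of NumPy. This is an illustrative reading, not the authors' implementation: the abstract does not give formulas, so the function names, the definition of each strategy, and the `alpha` strength parameter for Partial Projection are all assumptions. Here `d` stands for some bias direction, for example a warmth or competence axis.

```python
import numpy as np

def unit(v):
    """Scale v to unit length (returned unchanged if it is the zero vector)."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def compound_sum(a, b):
    """Compound identity vector by summation (keeps the dimensionality)."""
    return a + b

def compound_concat(a, b):
    """Compound identity vector by concatenation (doubles the dimensionality)."""
    return np.concatenate([a, b])

def subtraction(v, d):
    """Assumed form: subtract the bias direction from the vector."""
    return v - d

def linear_projection(v, d):
    """Assumed form: remove the full component of v along d."""
    d = unit(d)
    return v - np.dot(v, d) * d

def partial_projection(v, d, alpha=0.5):
    """Assumed form: remove only a fraction alpha of the component along d."""
    d = unit(d)
    return v - alpha * np.dot(v, d) * d

# Toy vectors standing in for real embeddings.
word = np.array([3.0, 4.0])
bias_dir = np.array([2.0, 0.0])
fully_debiased = linear_projection(word, bias_dir)
partly_debiased = partial_projection(word, bias_dir, alpha=0.5)
```

Under these assumed definitions, Partial Projection with `alpha=1` coincides with Linear Projection and smaller values change the vector less, which is one way to read the description of it as the more conservative option.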

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Hate Speech and Cyberbullying Detection