Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving

Lei Ding; Dengdeng Yu; Jinhan Xie; Wenxing Guo; Shenggang Hu; Meichen Liu; Linglong Kong; Hongsheng Dai; Yanchun Bao; Bei Jiang

arXiv:2112.05194·cs.CL·December 13, 2021

TL;DR

This paper introduces a causal inference-based method to reduce gender bias in word embeddings while preserving semantic information, improving fairness and performance in NLP tasks.

Contribution

It presents a novel causal inference framework for debiasing word embeddings that maintains semantic integrity, outperforming existing methods.

Findings

01 State-of-the-art gender bias reduction

02 Improved word similarity performance

03 Enhanced downstream NLP task results

Abstract

With widening deployments of natural language processing (NLP) in daily life, inherited social biases from NLP models have become more severe and problematic. Previous studies have shown that word embeddings trained on human-generated corpora have strong gender biases that can produce discriminative results in downstream tasks. Previous debiasing methods focus mainly on modeling bias and only implicitly consider semantic information while completely overlooking the complex underlying causal structure among bias and semantic components. To address these issues, we propose a novel methodology that leverages a causal inference framework to effectively remove gender bias. The proposed method allows us to construct and analyze the complex causal mechanisms facilitating gender information flow while retaining oracle semantic information within word embeddings. Our comprehensive experiments…
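
To make the problem in the abstract concrete, here is a minimal sketch of how gender bias in an embedding is often measured and removed by simple projection. This is NOT the causal method proposed in the paper; it is the plain projection baseline, shown only to illustrate the trade-off the abstract describes. The toy vectors, the word list, and the helper names (`gender_direction`, `project_out`) are all hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def cosine(u, v):
    return dot(u, v) / (norm(u) * norm(v))

def gender_direction(he, she):
    # Unit vector along the difference of a gendered word pair.
    d = [a - b for a, b in zip(he, she)]
    n = norm(d)
    return [x / n for x in d]

def project_out(w, g):
    # Remove the component of w that lies along the unit vector g.
    c = dot(w, g)
    return [a - c * b for a, b in zip(w, g)]

# Hypothetical 3-dimensional toy embeddings (assumption, not real data).
emb = {
    "he": [1.0, 0.2, 0.1],
    "she": [-1.0, 0.2, 0.1],
    "engineer": [0.4, 0.9, 0.3],
    "nurse": [-0.5, 0.8, 0.4],
}

g = gender_direction(emb["he"], emb["she"])
targets = ("engineer", "nurse")

# Bias score: cosine between a word vector and the gender direction.
bias_before = {w: cosine(emb[w], g) for w in targets}
debiased = {w: project_out(emb[w], g) for w in targets}
bias_after = {w: cosine(debiased[w], g) for w in targets}

# Similarity between the two occupation words before and after projection.
sim_before = cosine(emb["engineer"], emb["nurse"])
sim_after = cosine(debiased["engineer"], debiased["nurse"])

print(bias_before, bias_after)
print(sim_before, sim_after)
```

In this toy setup the projection drives the bias score to zero, but it also changes the similarity between the two occupation words. That side effect on semantic information is the kind of issue the abstract says earlier methods consider only implicitly.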

Code & Models

Repositories

Lei-Ding07/Word_Debias_DeSIP
PyTorch · Official

Videos

Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification