A Survey on Gender Bias in Natural Language Processing

Karolina Stanczak; Isabelle Augenstein

arXiv:2112.14168 · cs.CL · December 30, 2021 · 58 cites

TL;DR

This survey reviews 304 papers on gender bias in NLP. It highlights key limitations, such as treating gender as binary, a narrow focus on English and other high-resource languages, and methodological flaws, and offers recommendations for future research.

Contribution

It provides a comprehensive analysis of existing gender bias research in NLP, identifying core limitations and proposing future directions to address them.

Findings

01

Most research treats gender as binary, ignoring fluidity.

02

Research is concentrated on English and high-resource languages.

03

Many proposed algorithms are not tested for bias, and their ethical implications go unexamined.

Abstract

Language can be used as a means of reproducing and enforcing harmful stereotypes and biases and has been analysed as such in numerous studies. In this paper, we present a survey of 304 papers on gender bias in natural language processing. We analyse definitions of gender and its categories within social sciences and connect them to formal definitions of gender bias in NLP research. We survey lexica and datasets applied in research on gender bias and then compare and contrast approaches to detecting and mitigating gender bias. We find that research on gender bias suffers from four core limitations. 1) Most research treats gender as a binary variable, neglecting its fluidity and continuity. 2) Most of the work has been conducted in monolingual setups for English or other high-resource languages. 3) Despite a myriad of papers on gender bias in NLP methods, we find that most of the newly…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Gender Studies in Language