Gender Inequality in English Textbooks Around the World: an NLP Approach

Tairan Liu

arXiv:2506.02425·cs.CL·June 4, 2025

Open Access

TL;DR

This study uses NLP techniques to quantify and compare gender inequality in English textbooks from 22 countries, revealing consistent male overrepresentation across diverse cultural contexts.

Contribution

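It introduces a cross-cultural NLP framework for measuring gender bias in textbooks. The framework combines several metrics (character count, firstness, TF-IDF word associations, and GloVe embedding associations) and tests whether large language models can distinguish between gendered word lists.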

Findings

01 Male characters are overrepresented in count, firstness, and named entities.

02 Gender inequality exists in all regions studied, with the Latin sphere showing the least disparity.

03 NLP methods can effectively quantify and analyze gender bias in educational texts.

Abstract

Textbooks play a critical role in shaping children's understanding of the world. While previous studies have identified gender inequality in individual countries' textbooks, few have examined the issue cross-culturally. This study applies natural language processing methods to quantify gender inequality in English textbooks from 22 countries across 7 cultural spheres. Metrics include character count, firstness (which gender is mentioned first), and TF-IDF word associations by gender. The analysis also identifies gender patterns in proper names appearing in TF-IDF word lists, tests whether large language models can distinguish between gendered word lists, and uses GloVe embeddings to examine how closely keywords associate with each gender. Results show consistent overrepresentation of male characters in terms of count, firstness, and named entities. All regions exhibit gender inequality,…
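
To make the first two metrics concrete, here is a minimal Python sketch of character count and firstness as the abstract describes them. It is an illustration under stated assumptions, not the paper's implementation: the `MALE` and `FEMALE` lexicons, the function names, the sentence-level definition of firstness, and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny lexicons; the paper's word lists are not given here.
MALE = {"he", "him", "his", "man", "men", "boy", "boys", "father", "brother", "mr"}
FEMALE = {"she", "her", "hers", "woman", "women", "girl", "girls", "mother", "sister", "mrs", "ms"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def gender_of(token):
    """Return 'male', 'female', or None for a single token."""
    if token in MALE:
        return "male"
    if token in FEMALE:
        return "female"
    return None


def gender_counts(text):
    """Count gendered tokens in the whole text."""
    counts = Counter()
    for token in tokenize(text):
        gender = gender_of(token)
        if gender is not None:
            counts[gender] += 1
    return counts


def firstness(text):
    """For each sentence that mentions both genders, record which came first.

    Assumption: firstness is measured per sentence; the paper may use a
    different unit (for example, paired phrases such as "he and she").
    """
    first = Counter()
    for sentence in re.split(r"[.!?]+", text):
        order = []
        for token in tokenize(sentence):
            gender = gender_of(token)
            if gender is not None and gender not in order:
                order.append(gender)
        if len(order) == 2:
            first[order[0]] += 1
    return first


# Made-up sample text, not taken from any textbook in the study.
sample = (
    "He and his sister went to school. "
    "The girl met a boy. "
    "Mr Smith and Mrs Smith teach English."
)
counts = gender_counts(sample)
first = firstness(sample)
print(counts["male"], counts["female"])  # 4 3
print(first["male"], first["female"])    # 2 1
```

A lexicon lookup like this ignores names, coreference, and ambiguous words, so it should be read only as a way to see what the two metrics measure.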

Taxonomy

Topics: Gender Studies in Language

Methods: GloVe Embeddings