Wikigender: A Machine Learning Model to Detect Gender Bias in Wikipedia
Natalie Bolón Brun, Sofia Kypraiou, Natalia Gullón Altés, Irene Petlacalco Barrios

TL;DR
This paper presents a machine learning approach to detect gender bias in Wikipedia biographies by analyzing adjectives and topics, revealing gender stereotypes and subjective portrayals.
Contribution
It introduces a novel ML model that identifies gender bias in Wikipedia text, focusing on adjectives and topics to uncover stereotypes and subjective language.
Findings
Women are described with more subjective adjectives.
Women are associated with family topics.
Men are linked to business and sports.
Abstract
The way Wikipedia's contributors think can influence how they describe individuals, resulting in gender-based bias. We use a machine learning model to show that women and men are portrayed differently on Wikipedia. Additionally, we use the results of the model to identify which words create bias in the overview sections of English Wikipedia biographies. Using only adjectives as input to the model, we show that the adjectives used to portray women have a higher subjectivity than those used to describe men. Extracting topics from the overview using nouns and adjectives as input to the model, we find that women are associated with family, while men are associated with business and sports.
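As a rough illustration of how one might surface the words that separate two groups of biographies, the sketch below scores each adjective by a smoothed log-ratio of its relative frequency in the two groups. This is a simple stand-in technique, not the model used in the paper, which the abstract does not specify. The function name `log_ratio_scores`, the smoothing parameter `alpha`, and the placeholder tokens are all assumptions made for the example; the toy input is invented and carries no empirical meaning.

```python
import math
from collections import Counter


def log_ratio_scores(docs_a, docs_b, alpha=1.0):
    """Score each word by log(P(word | group A) / P(word | group B)).

    docs_a, docs_b: lists of documents, each a list of adjective tokens.
    alpha: additive (Laplace) smoothing so unseen words do not give log(0).
    A positive score means the word is relatively more frequent in group A,
    a negative score means it is relatively more frequent in group B.
    """
    counts_a = Counter(word for doc in docs_a for word in doc)
    counts_b = Counter(word for doc in docs_b for word in doc)
    vocab = set(counts_a) | set(counts_b)

    # Smoothed totals: every vocabulary word receives alpha pseudo-counts.
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)

    scores = {}
    for word in vocab:
        p_a = (counts_a[word] + alpha) / total_a
        p_b = (counts_b[word] + alpha) / total_b
        scores[word] = math.log(p_a / p_b)
    return scores


# Invented placeholder tokens, standing in for adjectives extracted
# from the overview sections of two groups of biographies.
group_a = [["adj_x", "adj_shared"], ["adj_x"]]
group_b = [["adj_y", "adj_shared"], ["adj_y"]]

scores = log_ratio_scores(group_a, group_b)
for word, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{word}: {score:+.3f}")
```

Words at either end of the sorted list are the ones used most unevenly across the two groups. A trained classifier, as described in the abstract, would additionally account for how words combine, which this frequency-based sketch does not.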
Taxonomy
Topics: Wikis in Education and Collaboration · Hate Speech and Cyberbullying Detection · Digital Games and Media
