Run Like a Girl! Sports-Related Gender Bias in Language and Vision
Sophia Harrison, Eleonora Gualdoni, Gemma Boleda

TL;DR
This paper analyzes gender bias in two Language and Vision datasets and in a model trained on naming data. Both datasets underrepresent women, and speakers name people playing sports with sport-related terms (e.g. 'tennis player') more often for men and boys than for women and girls, a bias the model reproduces.
Contribution
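It analyzes gender bias in two Language and Vision datasets, documents both the underrepresentation of women and a gender bias in how people playing sports are named, and shows that a computational model trained on the naming data reproduces that bias. The authors argue that both the data and the model result in representational harm against women.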
Findings
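Women are underrepresented in both datasets analyzed, which promotes their invisibilization.
For people playing sports, speakers produce sport-related names (e.g. 'tennis player', 'surfer') more often for men and boys than for women and girls (46% vs. 35% on average).
A computational model trained on these naming data reproduces the bias.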
Abstract
Gender bias in Language and Vision datasets and models has the potential to perpetuate harmful stereotypes and discrimination. We analyze gender bias in two Language and Vision datasets. Consistent with prior work, we find that both datasets underrepresent women, which promotes their invisibilization. Moreover, we hypothesize and find that a bias affects human naming choices for people playing sports: speakers produce names indicating the sport (e.g. 'tennis player' or 'surfer') more often when it is a man or a boy participating in the sport than when it is a woman or a girl, with an average of 46% vs. 35% of sports-related names for each gender. A computational model trained on these naming data reproduces the bias. We argue that both the data and the model result in representational harm against women.
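The 46% vs. 35% comparison in the abstract is a per-gender rate of sport-related names. A minimal sketch of how such a rate could be computed is below. It is not the authors' pipeline: the `SPORT_NAMES` list, the `sports_name_rate` function, the (gender, name) annotation format, and the toy data are all hypothetical, and the way the paper averages its figures is not specified in this summary.

```python
from collections import defaultdict

# Hypothetical list of sport-related names; the paper's actual criteria are not given here.
SPORT_NAMES = {"tennis player", "surfer", "skier", "batter"}


def sports_name_rate(annotations):
    """Return, per gender, the fraction of names that are sport-related.

    `annotations` is an iterable of (gender, name) pairs, an assumed format.
    """
    totals = defaultdict(int)
    sporty = defaultdict(int)
    for gender, name in annotations:
        totals[gender] += 1
        if name.lower() in SPORT_NAMES:
            sporty[gender] += 1
    return {gender: sporty[gender] / totals[gender] for gender in totals}


# Toy, made-up annotations for illustration only (not data from the paper).
toy = [
    ("male", "tennis player"), ("male", "man"), ("male", "surfer"), ("male", "boy"),
    ("female", "woman"), ("female", "surfer"), ("female", "girl"), ("female", "woman"),
]
rates = sports_name_rate(toy)
```

A gap between the two rates on real naming data, for comparable images, is the kind of difference the abstract reports.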
Taxonomy
Topics: Sports, Gender, and Society · Sports Analytics and Performance · Names, Identity, and Discrimination Research
