Gender assignment in doctoral theses: revisiting Teseo with a method based on cultural consensus theory
Nataly Matias-Rayme, Iuliana Botezan, Mari Carmen Suárez-Figueroa, and Rodrigo Sánchez-Jiménez

TL;DR
This paper evaluates various gender assignment methods for doctoral theses, introducing a cultural consensus theory-based classifier that improves accuracy and transparency over traditional techniques, with implications for gender studies and academic demographics.
Contribution
The study introduces nomquamgender, a novel cultural consensus-based method for gender assignment, and applies it to a large Spanish dissertation database, demonstrating its advantages over existing methods.
Findings
Significant reduction in unknown gender assignments using the new method
Methodological choices greatly affect gender distribution results
Gender imbalances persist in doctoral data, especially in STEM fields
Abstract
This study critically evaluates gender assignment methods in academic contexts through a comparative analysis of several techniques: an SVM classifier, gender-guesser, genderize.io, and a classifier based on Cultural Consensus Theory. Emphasizing the significance of transparency, data sources, and methodological considerations, the research introduces nomquamgender, a cultural consensus-based method, and applies it to Teseo, a Spanish dissertation database. The results reveal a substantial reduction in the number of individuals with unknown gender compared to traditional methods relying on INE data. The nuanced differences in gender distribution underscore the importance of methodological choices in gender studies and call for transparent, comprehensive, and freely accessible methods to enhance the accuracy and reliability of gender assignment in academic research. After…
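To illustrate why methodological choices can change the share of "unknown" assignments, here is a minimal Python sketch of a thresholded name lookup. It is not the paper's method and does not use the nomquamgender, gender-guesser, or genderize.io APIs; the reference counts, the function names `assign_gender` and `distribution`, and the threshold values are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical reference counts: first name -> (female_count, male_count).
# A real study would draw these from a documented source; these are made up.
REFERENCE = {
    "maria": (980, 20),
    "rodrigo": (5, 995),
    "andrea": (600, 400),
}


def assign_gender(first_name, reference, threshold=0.9):
    """Return 'female', 'male', or 'unknown' for a first name.

    A name is labelled only when the share of one gender in the
    reference counts reaches the threshold; otherwise it stays unknown.
    """
    counts = reference.get(first_name.strip().lower())
    if counts is None:
        return "unknown"  # name absent from the reference data
    female, male = counts
    total = female + male
    if total == 0:
        return "unknown"
    p_female = female / total
    if p_female >= threshold:
        return "female"
    if 1 - p_female >= threshold:
        return "male"
    return "unknown"  # name is too ambiguous at this threshold


def distribution(names, reference, threshold=0.9):
    """Count assigned labels over a list of first names."""
    return Counter(assign_gender(n, reference, threshold) for n in names)


names = ["Maria", "Rodrigo", "Andrea", "Xyz"]
strict = distribution(names, REFERENCE, threshold=0.9)
lenient = distribution(names, REFERENCE, threshold=0.55)
```

In this toy data, lowering the threshold moves the ambiguous name out of the unknown group, while the name missing from the reference stays unknown under either setting. That mirrors, in a simplified way, the two levers the abstract points to: the coverage of the data source and the decision rule applied to it.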
