Inferring gender from name: a large scale performance evaluation study
Kriste Krstovski, Yao Lu, Ye Xu

TL;DR
This study systematically evaluates existing name-to-gender inference methods on large datasets and introduces two hybrid approaches that outperform current techniques in accuracy.
Contribution
It provides a comprehensive performance comparison of existing methods and proposes two novel hybrid approaches that improve inference accuracy.
Findings
Hybrid approaches outperform single methods
Large-scale datasets enable robust evaluation
New methods achieve higher accuracy
Abstract
A person's gender is a crucial piece of information when performing research across a wide range of scientific disciplines, such as medicine, sociology, political science, and economics, to name a few. However, in increasing instances, especially given the proliferation of big data, gender information is not readily available. In such cases researchers need to infer gender from readily available information, primarily from persons' names. While inferring gender from name may raise some ethical questions, the lack of viable alternatives means that researchers have to resort to such approaches when the goal justifies the means - in the majority of such studies the goal is to examine patterns and determinants of gender disparities. The necessity of name-to-gender inference has generated an ever-growing domain of algorithmic approaches and software products. These approaches have been used…
Taxonomy
Topics: Authorship Attribution and Profiling · Names, Identity, and Discrimination Research
