Gender Differences in Public Code Contributions: a 50-year Perspective
Stefano Zacchiroli (UP, Inria, DGD-I)

TL;DR
This study analyzes 50 years of public code contributions to measure gender disparities. It finds a slow but steady increase in the share of contributions by female authors, which suggests that a more gender-balanced future is possible.
Contribution
The paper provides the first large-scale longitudinal analysis of gender trends in public code contributions over five decades, covering 1.6 billion commits from 120 million projects and 33 million distinct authors.
Findings
The proportion of commits by female authors shows a stable long-term increase.
The overall amount of commits by female authors remains low.
The long-term trend suggests potential for a more gender-balanced future in collaborative software development.
Abstract
Gender imbalance in information technology in general, and Free/Open Source Software specifically, is a well-known problem in the field. Still, little is known about the large-scale extent and long-term trends that underpin the phenomenon. We help fill this gap by conducting a longitudinal study of the population of contributors to publicly available software source code. We analyze 1.6 billion commits corresponding to the development history of 120 million projects, contributed by 33 million distinct authors over a period of 50 years. We classify author names by gender and study their evolution over time. We show that, while the amount of commits by female authors remains low overall, there is evidence of a stable long-term increase in their proportion over all contributions, providing hope of a more gender-balanced future for collaborative software development.
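As a rough illustration of the kind of analysis the abstract describes (classifying author names by gender, then tracking proportions over time), here is a minimal Python sketch. It is not the paper's method. The lookup table NAME_GENDER, the helper names guess_gender and female_share_by_year, and the choice to exclude unclassified names from the denominator are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy lookup table (an assumption for illustration only);
# a real study would rely on a dedicated name-to-gender resource.
NAME_GENDER = {
    "alice": "female",
    "maria": "female",
    "bob": "male",
    "john": "male",
}


def guess_gender(author_name):
    """Guess 'female', 'male', or 'unknown' from the first token of a name."""
    tokens = author_name.strip().lower().split()
    if not tokens:
        return "unknown"
    return NAME_GENDER.get(tokens[0], "unknown")


def female_share_by_year(commits):
    """Compute the yearly share of commits by authors classified as female.

    `commits` is an iterable of (year, author_name) pairs. Commits whose
    author cannot be classified are left out of the denominator (an
    assumption of this sketch, not necessarily the paper's choice).
    """
    counts = defaultdict(Counter)
    for year, author in commits:
        counts[year][guess_gender(author)] += 1

    shares = {}
    for year in sorted(counts):
        known = counts[year]["female"] + counts[year]["male"]
        if known:  # skip years with no classifiable authors
            shares[year] = counts[year]["female"] / known
    return shares


if __name__ == "__main__":
    sample = [
        (1990, "John Doe"),
        (1990, "Bob Roe"),
        (1990, "Alice Poe"),
        (2020, "Maria Lee"),
        (2020, "John Doe"),
        (2020, "zzz"),
    ]
    for year, share in female_share_by_year(sample).items():
        print(year, round(share, 3))
```

Counting by commit rather than by distinct author matches the abstract's focus on the proportion of contributions. A per-author variant would deduplicate names within each year before counting.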
