Towards Identifying Paid Open Source Developers - A Case Study with Mozilla Developers
Maëlick Claes, Mika Mäntylä, Miikka Kuutila, Umar Farooq

TL;DR
This study develops a machine learning approach to distinguish paid from volunteer open source developers using Mozilla data, achieving higher accuracy than previous automatic methods.
Contribution
It introduces a novel classification method based on commit patterns and meta-data, improving the automatic identification of paid developers in open source projects.
Findings
Random forest achieves 0.75 AUC in identifying paid developers.
The proposed method outperforms previous automatic techniques.
Commit time patterns are effective indicators of developer employment status.
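To make the AUC figure above concrete: AUC can be read as the probability that a randomly chosen paid developer receives a higher classifier score than a randomly chosen volunteer. The sketch below computes it by direct pair counting. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half.

    labels: 1 for paid, 0 for volunteer (an assumed encoding).
    scores: classifier scores, higher meaning "more likely paid".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two paid developers, two volunteers.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
auc = roc_auc(labels, scores)  # 3 of 4 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so 0.75 sits halfway between the two.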
Abstract
Open source development contains contributions from both hired and volunteer software developers. Identification of this status is important when we consider the transferability of research results to the closed source software industry, which includes no volunteer developers. While many studies have taken the employment status of developers into account, this information is often gathered manually due to the lack of accurate automatic methods. In this paper, we present an initial step towards predicting paid and unpaid open source development using machine learning and compare our results with automatic techniques used in prior work. By relying on source code repository meta-data from Mozilla, and manually collected employment status, we built a dataset of the most active developers, both volunteer and hired by Mozilla. We define a set of metrics based on developers' usual commit time…
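The abstract is cut off before the commit-time metrics are defined, so the following is only a guess at what such features could look like, not the authors' definitions. It is a minimal sketch that assumes commit timestamps are already in each developer's local time and that "office hours" means 09:00 to 17:00 on weekdays. The function name `commit_time_features` and both parameters are hypothetical.

```python
from datetime import datetime


def commit_time_features(timestamps, work_start=9, work_end=17):
    """Summarise when a developer usually commits.

    timestamps: list of datetime objects, assumed to be in local time.
    work_start, work_end: assumed office-hour bounds (24h clock).
    """
    if not timestamps:
        raise ValueError("no commits to summarise")
    total = len(timestamps)
    # weekday() returns 5 for Saturday and 6 for Sunday.
    weekend = sum(1 for t in timestamps if t.weekday() >= 5)
    office = sum(
        1
        for t in timestamps
        if t.weekday() < 5 and work_start <= t.hour < work_end
    )
    return {
        "office_hours_ratio": office / total,
        "weekend_ratio": weekend / total,
    }


# Hypothetical commit history for one developer.
commits = [
    datetime(2017, 3, 6, 10, 15),   # Monday morning
    datetime(2017, 3, 7, 14, 0),    # Tuesday afternoon
    datetime(2017, 3, 8, 22, 30),   # Wednesday night
    datetime(2017, 3, 11, 11, 0),   # Saturday
]
features = commit_time_features(commits)
```

Ratios like these could be fed to a classifier such as a random forest, on the intuition that paid developers commit mostly during working hours while volunteers commit more often in evenings and on weekends.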
Taxonomy
Topics: Open Source Software Innovations · Software Engineering Research · Wikis in Education and Collaboration
