Predicting long time contributors with knowledge units of programming languages: an empirical study
Md Ahasanuzzaman, Gustavo A. Oliva, and Ahmed E. Hassan

TL;DR
This empirical study introduces KULTC, a model that uses knowledge units of the Java programming language to predict which developers will become long-time contributors. KULTC outperforms existing models and can inform contributor retention strategies.
Contribution
The paper presents a novel prediction model for long-time contributors (LTCs) based on knowledge units (KUs) and shows that it outperforms state-of-the-art approaches on Java open-source projects.
Findings
KULTC achieves a median AUC of at least 0.75.
KULTC outperforms BAOLTC significantly.
Combining KULTC with BAOLTC improves performance by 16.5%.
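For readers unfamiliar with the metric, here is a minimal sketch of how an AUC value and a median AUC over several evaluation runs can be computed. It is not the authors' evaluation code. The function names `auc` and `median_auc` and the toy labels and scores are invented for illustration, and the paper's actual evaluation setup is not described in this summary.

```python
from statistics import median


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    is scored higher than a randomly chosen negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def median_auc(runs):
    """Median AUC over several (labels, scores) evaluation runs."""
    return median(auc(labels, scores) for labels, scores in runs)


# Hypothetical toy data: 1 = became a long-time contributor, 0 = did not.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.3, 0.1]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a median of at least 0.75 means that in a typical run the model ranks a true LTC above a non-LTC in at least three out of four such pairs.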
Abstract
Predicting potential long-time contributors (LTCs) early allows project maintainers to effectively allocate resources and mentoring to enhance their development and retention. Mapping programming language expertise to developers and characterizing projects in terms of how they use programming languages can help identify developers who are more likely to become LTCs. However, prior studies on predicting LTCs do not consider programming language skills. This paper reports an empirical study on the usage of knowledge units (KUs) of the Java programming language to predict LTCs. A KU is a cohesive set of key capabilities that are offered by one or more building blocks of a given programming language. We build a prediction model called KULTC, which leverages KU-based features along five different dimensions. We detect and analyze KUs from the studied 75 Java projects (353K commits and 168K…
Taxonomy
Topics: Wikis in Education and Collaboration · Software Engineering Research · Open Source Software Innovations
