Towards Mining OSS Skills from GitHub Activity
Jenny T. Liang, Thomas Zimmermann, Denae Ford

TL;DR
This paper introduces Disko, a novel tool that detects open source software skills from GitHub activity signals, enabling better contributor assessment and project role assignment.
Contribution
The paper presents a new method for detecting OSS skills from activity signals and prototypes it in a tool called Disko. To the authors' knowledge, no tool for detecting OSS-related skills previously existed.
Findings
Disko detects the presence of skills with precision between 77% and 97%, as evaluated in a survey of 455 OSS contributors.
Over 54% of survey participants would display their high-proficiency skills.
The approach can improve collaborator identification and role assignment.
Abstract
Open source software (OSS) development relies on diverse skill sets. However, to our knowledge, there are no tools that detect OSS-related skills. In this paper, we present a novel method to detect OSS skills and prototype it in a tool called Disko. Our approach relies on identifying relevant signals, which are measurable activities or cues associated with a skill. Our tool detects how contributors 1) teach others to be involved in OSS projects, 2) show commitment towards an OSS project, 3) have knowledge in specific programming languages, and 4) are familiar with OSS practices. We then evaluate the tool by administering a survey to 455 OSS contributors. We demonstrate that Disko yields promising results: it detects the presence of these skills with precision scores between 77% and 97%. We also find that over 54% of participants would display their high-proficiency skills. Our approach…
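To make the idea of signals concrete, here is a minimal Python sketch of signal-based skill detection and a precision check against self-reported skills. It is not Disko's implementation: the activity field names, the thresholds, the one-signal-per-skill mapping, and the pooled precision calculation are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

# Hypothetical per-contributor activity counts; the keys are illustrative only.
Activity = Dict[str, int]


@dataclass(frozen=True)
class Signal:
    """A measurable activity or cue associated with a skill."""
    description: str
    fires: Callable[[Activity], bool]


# Hypothetical signals and thresholds for the four skill areas in the abstract.
SKILL_SIGNALS: Dict[str, List[Signal]] = {
    "teaches others": [
        Signal("replied to many first-time contributors",
               lambda a: a.get("newcomer_replies", 0) >= 5),
    ],
    "shows commitment": [
        Signal("active over many months",
               lambda a: a.get("active_months", 0) >= 12),
    ],
    "knows Python": [
        Signal("many lines changed in .py files",
               lambda a: a.get("python_lines_changed", 0) >= 1000),
    ],
    "familiar with OSS practices": [
        Signal("many merged pull requests",
               lambda a: a.get("merged_pull_requests", 0) >= 10),
    ],
}


def detect_skills(activity: Activity, min_signals: int = 1) -> Set[str]:
    """Return the skills for which at least `min_signals` signals fire."""
    detected = set()
    for skill, signals in SKILL_SIGNALS.items():
        fired = sum(1 for signal in signals if signal.fires(activity))
        if fired >= min_signals:
            detected.add(skill)
    return detected


def precision(detected: Set[str], confirmed: Set[str]) -> Optional[float]:
    """Fraction of detected skills that the contributor confirmed.

    Returns None when nothing was detected, since precision is undefined.
    """
    if not detected:
        return None
    return len(detected & confirmed) / len(detected)


# Made-up example data, not taken from the paper.
example_activity = {
    "newcomer_replies": 7,
    "active_months": 3,
    "python_lines_changed": 2500,
    "merged_pull_requests": 12,
}
detected = detect_skills(example_activity)
confirmed = {"teaches others", "knows Python"}  # e.g. from a survey response
score = precision(detected, confirmed)
```

In this sketch, a survey response plays the role of ground truth: a detected skill counts as correct only if the contributor confirms having it. The paper's evaluation may aggregate differently, for example per skill across all participants.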
Taxonomy
Topics: Software Engineering Research · Open Source Software Innovations · Software Engineering Techniques and Practices
