SkillScope: A Tool to Predict Fine-Grained Skills Needed to Solve Issues on GitHub
Benjamin C. Carter, Jonathan Rivas Contreras, Carlos A. Llanes Villegas, Pawan Acharya, Jack Utzerath, Adonijah O. Farner, Hunter Jenkins, Dylan Johnson, Jacob Penney, Igor Steinmacher, Marco A. Gerosa, Fabio Santos

TL;DR
SkillScope is a novel tool that leverages large language models and machine learning to predict detailed, multilevel programming skills needed for GitHub issues, aiding new contributors in OSS projects.
Contribution
This paper introduces SkillScope, a new tool combining LLMs and Random Forests to predict detailed skills for GitHub issues, surpassing prior label-based approaches.
Findings
Achieved 91% precision in skill prediction
Predicted 217 distinct multilevel skills across issues
Demonstrated effectiveness on Java GitHub projects
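To make the precision figure concrete: each issue can carry several skill labels at once, so precision has to be computed over sets of labels rather than a single class per issue. The sketch below shows one common way to do that (micro-averaging). It is an illustration only: the averaging scheme, the function name `micro_precision_recall`, and the example labels are assumptions, not taken from the paper.

```python
from typing import Dict, List, Set


def micro_precision_recall(
    predicted: List[Set[str]], actual: List[Set[str]]
) -> Dict[str, float]:
    """Micro-averaged precision and recall over per-issue label sets."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    tp = fp = fn = 0
    for pred, gold in zip(predicted, actual):
        tp += len(pred & gold)   # labels predicted and truly required
        fp += len(pred - gold)   # labels predicted but not required
        fn += len(gold - pred)   # required labels that were missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical multilevel skill labels for two issues (not from the paper).
predicted = [{"UI/Layout", "DB/Query"}, {"Network/HTTP"}]
actual = [{"UI/Layout"}, {"Network/HTTP", "Security/Auth"}]

scores = micro_precision_recall(predicted, actual)
print(scores)
```

In this toy input, two of the three predicted labels are correct and one of the three required labels is missed, so both precision and recall come out to 2/3.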
Abstract
New contributors often struggle to find tasks that they can tackle when onboarding onto a new Open Source Software (OSS) project. One reason for this difficulty is that issue trackers lack explanations about the knowledge or skills needed to complete a given task successfully. These explanations can be complex and time-consuming to produce. Past research has partially addressed this problem by labeling issues with issue types, issue difficulty level, and issue skills. However, current approaches are limited to a small set of labels and lack in-depth details about their semantics, which may not sufficiently help contributors identify suitable issues. To surmount this limitation, this paper explores large language models (LLMs) and Random Forest (RF) to predict the multilevel skills required to solve the open issues. We introduce a novel tool, SkillScope, which retrieves current issues…
Taxonomy
Topics: Online Learning and Analytics · Innovative Teaching and Learning Methods · Software System Performance and Reliability
Methods: Sparse Evolutionary Training
