Quality Classifiers for Open Source Software Repositories
George Tsatsaronis, Maria Halkidi, Emmanouel A. Giakoumakis

TL;DR
This paper presents a data mining approach that trains classifiers on OSS repository meta-data to predict project success, aiding in identifying promising open source projects for further development.
Contribution
It introduces a novel method for training classifiers on OSS repository meta-data to predict project success, a new application of data mining to open source software analysis.
Findings
Classifiers can effectively predict OSS project success.
Meta-data features are strong indicators of project continuation.
The approach aids in early identification of promising OSS projects.
Abstract
Open Source Software (OSS) projects often rely on large repositories, such as SourceForge, for initial incubation. These repositories offer a large variety of meta-data that provides interesting information about projects and their success. In this paper we propose a data mining approach for training classifiers on the OSS meta-data provided by such repositories. The classifiers learn to predict the successful continuation of an OSS project. The "successfulness" of a project is defined in terms of the confidence with which the classifier predicts that it could be ported into popular OSS projects (such as FreeBSD or Gentoo Portage).
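The idea of scoring a project by classifier confidence can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the classifier family or the meta-data features, so the sketch uses a small hand-written logistic regression, three hypothetical features (developer count, release activity, downloads, each scaled to [0, 1]), and made-up training rows whose label 1 stands for "was ported".

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    # Batch gradient descent on the logistic loss.
    # This is a stand-in classifier, not the method used in the paper.
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def confidence(model, x):
    # Predicted probability of the "ported" class, used as the success score.
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical meta-data rows: [developers, release activity, downloads].
TRAIN_X = [
    [0.9, 0.8, 0.9],
    [0.8, 0.9, 0.7],
    [0.7, 0.6, 0.8],
    [0.2, 0.1, 0.1],
    [0.1, 0.3, 0.2],
    [0.3, 0.2, 0.1],
]
TRAIN_Y = [1, 1, 1, 0, 0, 0]  # 1 = ported (made-up labels)

model = train_logistic(TRAIN_X, TRAIN_Y)

# Two unseen, equally hypothetical projects.
candidates = {
    "active_project": [0.85, 0.75, 0.8],
    "dormant_project": [0.15, 0.2, 0.1],
}
scores = {name: confidence(model, feats) for name, feats in candidates.items()}

# Rank candidates by confidence, most promising first.
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

Using the predicted probability rather than the hard class label gives a continuous score, so candidate projects can be ranked instead of only being split into two groups.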
Taxonomy
Topics: Data Mining Algorithms and Applications · Imbalanced Data Classification Techniques · Data Stream Mining Techniques
