Revealing the value of Repository Centrality in lifespan prediction of Open Source Software Projects
Runzhi He, Hengzhi Ye, Minghui Zhou

TL;DR
This paper introduces a new centrality metric based on user-repository networks to predict open source project deprecation, demonstrating its effectiveness in lifespan prediction models and highlighting its importance in ecosystem health monitoring.
Contribution
It proposes repository centrality using HITS weights and integrates it into lifespan prediction models, providing a novel approach to assess project health and deprecation risk.
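As a rough illustration of the idea behind HITS weights on a user-repository network, the sketch below runs the generic HITS power iteration on a small bipartite graph and reads each repository's authority score as a stand-in for its centrality. The edge list, the names `hits_scores`, `alice`, `repo_a`, and so on are hypothetical, and the paper's actual construction (which interactions form edges, how they are weighted, and how time windows are handled) is not described in this summary, so this should not be read as the authors' implementation.

```python
from math import sqrt


def hits_scores(edges, iterations=50):
    """Generic HITS on a bipartite user -> repository graph.

    edges: iterable of (user, repo) pairs (hypothetical interactions).
    Returns (hub, authority) dicts; users get hub scores and
    repositories get authority scores, each L2-normalized.
    """
    edges = list(edges)  # iterated several times below
    users = {u for u, _ in edges}
    repos = {r for _, r in edges}
    hub = {u: 1.0 for u in users}
    auth = {r: 1.0 for r in repos}

    for _ in range(iterations):
        # Authority update: sum the hub scores of users linked to each repo.
        new_auth = dict.fromkeys(repos, 0.0)
        for u, r in edges:
            new_auth[r] += hub[u]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {r: v / norm for r, v in new_auth.items()}

        # Hub update: sum the authority scores of repos each user links to.
        new_hub = dict.fromkeys(users, 0.0)
        for u, r in edges:
            new_hub[u] += auth[r]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {u: v / norm for u, v in new_hub.items()}

    return hub, auth


# Toy data (made up for illustration): three users touch repo_a, one touches repo_b.
edges = [
    ("alice", "repo_a"),
    ("bob", "repo_a"),
    ("carol", "repo_a"),
    ("carol", "repo_b"),
]
hub, auth = hits_scores(edges)
for repo in sorted(auth, key=auth.get, reverse=True):
    print(repo, round(auth[repo], 3))
```

In this toy graph `repo_a` ends up with the higher authority score because more users point to it. Recomputing such scores over successive time windows would be one way to observe the shifts in centrality that the summary refers to, though the windowing scheme here is an assumption.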
Findings
Repository centrality correlates with deprecation risk.
A drop in repository centrality indicates an increased likelihood of deprecation.
Models using centrality achieve high prediction accuracy.
Abstract
Background: Open Source Software is the building block of modern software. However, the prevalence of project deprecation in the open source world weakens the integrity of downstream systems and the broader ecosystem. It therefore calls for efforts to monitor and predict project deprecations, empowering stakeholders to take proactive measures. Challenge: Existing techniques mainly rely on static features taken at a single point in time to make predictions, resulting in limited effectiveness. Goal: We propose a novel metric derived from the user-repository network, use it to fit project deprecation predictors, and demonstrate its real-life implications. Method: We establish a comprehensive dataset containing 103,354 non-fork GitHub OSS projects spanning from 2011 to 2023. We propose repository centrality, a family of HITS weights that captures shifts in the popularity of a repository in the…
Taxonomy
Topics: Big Data and Business Intelligence
