Two Approaches to Survival Analysis of Open Source Python Projects
Derek Robinson, Keanelek Enns, Neha Koulecar, and Manish Sihag

TL;DR
This paper replicates a survival analysis study on open source Python projects using both frequentist and Bayesian methods, confirming key attributes that influence project longevity.
Contribution
It adds Bayesian survival analysis alongside the original study's frequentist approach and examines one additional project attribute as a conceptual replication.
Findings
Projects with major releases tend to survive longer.
Projects with repositories on multiple hosting services tend to survive longer.
Frequent revisions and larger developer teams correlate with higher survival chances.
Abstract
A recent study applied frequentist survival analysis methods to a subset of the Software Heritage Graph and determined which attributes of an OSS project contribute to its health. This paper serves as an exact replication of that study. In addition, Bayesian survival analysis methods were applied to the same dataset, and an additional project attribute was studied to serve as a conceptual replication. Both analyses focus on the effects of certain attributes on the survival of open-source software projects as measured by their revision activity. Methods such as the Kaplan-Meier estimator, Cox Proportional-Hazards model, and the visualization of posterior survival functions were used for each of the project attributes. The results show that projects which publish major releases, have repositories on multiple hosting services, possess a large team of developers, and make frequent revisions…
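To make the Kaplan-Meier estimator named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' code: the function name `kaplan_meier`, the durations, and the event flags are all hypothetical, and a project is assumed to count as "dead" when its revision activity stops (`observed=True`) and as censored otherwise. The estimate at each event time is the running product of (1 - deaths / projects still at risk).

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct event time.

    durations: time each project was followed (hypothetical unit: months)
    observed:  True if the project's activity ended at that time,
               False if it was still active when observation stopped (censored)
    """
    # Count how many events (not censorings) happen at each time.
    deaths = Counter(t for t, e in zip(durations, observed) if e)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Projects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data for six projects, for illustration only.
durations = [6, 12, 12, 24, 30, 36]
observed = [True, True, False, True, False, True]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored projects (here the ones followed for 12 and 30 months) still count toward the at-risk total until they leave observation, which is what distinguishes this estimate from a naive fraction of surviving projects.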
