The Empirical Commit Frequency Distribution of Open Source Projects
Carsten Kolassa, Dirk Riehle, Michel A. Salim

TL;DR
This paper provides a detailed quantitative analysis of commit frequencies in open-source projects, revealing distribution patterns and differences across project sizes and success levels to enhance understanding of software development processes.
Contribution
It presents the first comprehensive empirical analysis of commit frequency distributions in open-source projects, including comparisons by project size and success, and introduces an activity indicator.
Findings
Commit frequency distribution varies with project size and success.
Successful projects tend to have higher commit frequencies.
The analysis validates key assumptions about developer activity patterns.
Abstract
A fundamental unit of work in programming is the code contribution ("commit") that a developer makes to the code base of the project in work. An author's commit frequency describes how often that author commits. Knowing the distribution of all commit frequencies is a fundamental part of understanding software development processes. This paper presents a detailed quantitative analysis of commit frequencies in open-source software development. The analysis is based on a large sample of open source projects, and presents the overall distribution of commit frequencies. We analyze the data to show the differences between authors and projects by project size; we also include a comparison of successful and unsuccessful projects, and we derive an activity indicator from these analyses. By measuring a fundamental dimension of programming we help improve software development tools and our…
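To make "commit frequency" and its distribution concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not state how frequency is measured, so the sketch assumes it means the number of commits an author makes in one ISO calendar week. The function names and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def commits_per_author_week(commits):
    """Count commits for each (author, ISO year, ISO week) triple.

    `commits` is an iterable of (author, datetime) pairs.
    Assumption: one calendar week is the unit of frequency.
    """
    counts = Counter()
    for author, timestamp in commits:
        iso = timestamp.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(author, iso[0], iso[1])] += 1
    return counts


def frequency_distribution(commits):
    """Map 'commits in a week' to the number of author-weeks with that count.

    Only weeks in which an author committed at least once are counted;
    idle weeks do not appear in the result.
    """
    per_week = commits_per_author_week(commits)
    return Counter(per_week.values())


# Hypothetical sample data, for illustration only.
sample = [
    ("alice", datetime(2012, 3, 5)),
    ("alice", datetime(2012, 3, 6)),
    ("alice", datetime(2012, 3, 14)),
    ("bob", datetime(2012, 3, 7)),
]

distribution = frequency_distribution(sample)
for commits_in_week, author_weeks in sorted(distribution.items()):
    print(commits_in_week, author_weeks)
```

Grouping the input by project size or by a success criterion before calling `frequency_distribution` would give the kind of per-group comparison the abstract describes. The grouping criteria themselves are not given in this excerpt.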
