Developer Belief vs. Reality: The Case of the Commit Size Distribution
Dirk Riehle, Carsten Kolassa, Michel A. Salim

TL;DR
This paper shows that tool developers' beliefs about commit sizes differ from measured data by more than an order of magnitude, and it provides empirical commit size distributions to ground tool design assumptions.
Contribution
It presents an empirical analysis of commit size distributions across a large sample of open source projects and selected closed source projects, challenging assumptions used in tool development.
Findings
Tool developers' beliefs about commit sizes differ from reality by more than an order of magnitude.
The measured commit size distribution is reported for a large sample of open source projects and selected closed source projects.
These measurements can help align the design assumptions behind development tools with actual practice.
Abstract
The design of software development tools follows from what the developers of such tools believe is true about software development. A key aspect of such beliefs is the size of code contributions (commits) to a software project. In this paper, we show that what tool developers think is true about the size of code contributions is different by more than an order of magnitude from reality. We present this reality, called the commit size distribution, for a large sample of open source and selected closed source projects. We suggest that these new empirical insights will help improve software development tools by aligning underlying design assumptions closer with reality.
Taxonomy
Topics: Software Engineering Research · Open Source Software Innovations · Software Engineering Techniques and Practices
