Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development
Mahmoud Jahanshahi, David Reid, Audris Mockus

TL;DR
This paper investigates how prevalent copy-based reuse is in open source software and which factors influence it, highlighting how common the practice is, how aware developers are of it, and how project size and programming language affect reuse practices.
Contribution
It provides the first systematic measurement and analysis of copy-based reuse in open source, introducing a method to identify reused artifacts and examining developer motivations.
Findings
Copy-based reuse is widespread among open source projects.
Reuse propensity varies significantly across programming languages and file types.
Files from popular projects are more likely to be reused, but many reused files originate in smaller projects.
Abstract
In Open Source Software, resources of any project are open for reuse by introducing dependencies or copying the resource itself. In contrast to dependency-based reuse, the infrastructure to systematically support copy-based reuse appears to be entirely missing. Our aim is to enable future research and tool development to increase efficiency and reduce the risks of copy-based reuse. We seek a better understanding of such reuse by measuring its prevalence and identifying factors affecting the propensity to reuse. To identify reused artifacts and trace their origins, our method exploits World of Code infrastructure. We begin with a set of theory-derived factors related to the propensity to reuse, sample instances of different reuse types, and survey developers to better understand their intentions. Our results indicate that copy-based reuse is common, with many developers being aware of it…
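To make the idea of identifying reused artifacts and tracing their origins concrete, here is a minimal sketch. It is not the authors' pipeline and does not use the World of Code API. It assumes that a copied file can be recognized by identical content (hashed the way Git hashes blobs) and that the project with the earliest commit of that content is treated as the origin. The function names `git_blob_id` and `find_copied_blobs`, and the `(project, first_commit_time, content)` input format, are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 that Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def find_copied_blobs(observations):
    """Group identical file contents across projects.

    observations: iterable of (project, commit_time, content_bytes).
    Returns {blob_id: (origin_project, [later_projects])} for every blob
    seen in at least two projects. The origin is assumed (for this sketch)
    to be the project with the earliest commit time for that blob.
    """
    # blob id -> {project: earliest commit time seen in that project}
    earliest = defaultdict(dict)
    for project, commit_time, content in observations:
        blob = git_blob_id(content)
        previous = earliest[blob].get(project)
        if previous is None or commit_time < previous:
            earliest[blob][project] = commit_time

    reused = {}
    for blob, projects in earliest.items():
        if len(projects) < 2:
            continue  # content appears in one project only: no reuse
        # Sort by time, then by name so ties are resolved deterministically.
        ordered = sorted(projects.items(), key=lambda kv: (kv[1], kv[0]))
        origin = ordered[0][0]
        reused[blob] = (origin, [name for name, _ in ordered[1:]])
    return reused


if __name__ == "__main__":
    sample = [
        ("a/lib", 100, b"x = 1\n"),
        ("b/app", 200, b"x = 1\n"),   # same content, committed later
        ("c/tool", 150, b"y = 2\n"),  # unique content
    ]
    for blob, (origin, copies) in find_copied_blobs(sample).items():
        print(blob[:8], "origin:", origin, "copied into:", copies)
```

Exact-content matching of this kind only catches verbatim copies; files that were modified after copying would need a different similarity measure.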
