Unsafe and Unused? A History of Utility Code in Mature Open Source Projects
Brandon Keller, Kaitlin Yandik, Angela Ngo, Andy Meneely

TL;DR
This study investigates the evolution, usage, and security implications of 'util' files in mature open source projects through a longitudinal analysis of seven large repositories.
Contribution
It provides empirical insights into how util files are maintained, reused, or abandoned over time, highlighting their potential security risks and developer practices.
Findings
Util files can be up to 2.75 times more likely to have vulnerabilities than non-util files.
Longitudinal analysis shows util files' usage and maintenance patterns over project histories.
Util files persist for long periods, indicating their socio-technical significance.
Abstract
Filenames are a concise means of conveying information about source code to fellow developers. One such convention is util. Commonly understood to stand for "utility", filenames with the letters util are often an indication that the file contains code that may be broadly useful or reusable. Some projects use this convention heavily; for example, the Apache Tomcat server contains 925 files with util in the path name, which is 17.9% of all source code files in the tree. While the intent of the name may be to prevent duplicate code and reduce workload, what actually happens to util code over time? Do projects move away from util code as they mature? Are util files being used by colleagues, or maintained and used only by their author? The goal of our work is to help developers avoid creating unsafe and unused util files when developing their projects. We conducted a longitudinal mining…
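The abstract's headline measure (the share of source files with util in the path name) can be illustrated with a minimal sketch. This is not the authors' tooling: the function names, the case-insensitive substring match, and the set of extensions treated as source code are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumption: which extensions count as "source code" is not stated in the
# abstract; this set is purely illustrative.
SOURCE_EXTENSIONS = {".java", ".c", ".h", ".py"}


def is_util_path(path: str) -> bool:
    """Return True if the letters 'util' appear anywhere in the path.

    Assumption: the match is a case-insensitive substring test.
    """
    return "util" in path.lower()


def util_share(paths):
    """Return (util_count, source_count, percent_util) for a list of paths."""
    source_files = [
        p for p in paths
        if PurePosixPath(p).suffix.lower() in SOURCE_EXTENSIONS
    ]
    util_count = sum(1 for p in source_files if is_util_path(p))
    source_count = len(source_files)
    percent = 100.0 * util_count / source_count if source_count else 0.0
    return util_count, source_count, percent


if __name__ == "__main__":
    # Hypothetical paths, not taken from any real repository.
    sample = [
        "src/util/Strings.java",
        "src/net/HttpUtils.java",
        "src/server/Connector.java",
        "README.md",
    ]
    print(util_share(sample))
```

Run over a full file listing (for example, the output of `git ls-files`), a count of this kind is what a figure such as Tomcat's 925 util files at 17.9% would correspond to. Taken at face value, those two numbers imply roughly 5,170 source files in the tree.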
