How Reliable Are FOSS Popularity Metrics? Analyzing the Effort Required for Spoofing Common Software Popularity Metrics
Ben Swierzy, Timo Pohl, Marc Ohm, Michael Meier

TL;DR
This paper investigates how easily FOSS popularity metrics can be manipulated and finds that many can be spoofed with low effort, which calls into question their reliability for research and practical applications.
Contribution
The paper decomposes existing combined metrics into atomic metric categories, analyzes the effort required to spoof each category under a maintainer-centered threat model, and examines a real-world sybil attack on npm, highlighting vulnerabilities in FOSS impact metrics.
Findings
Many metrics, especially commit data and downloads, are easily manipulable.
A sybil attack created over 70,000 spam packages on npm.
Metrics should be used cautiously due to their susceptibility to spoofing.
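To make the concern concrete, here is a minimal sketch of how a combined popularity score can be shifted by inflating only the cheaply spoofable inputs. The scoring formula, the weights, and all numbers are hypothetical and chosen purely for illustration; they are not taken from the paper or from any real scoring tool.

```python
import math

# Hypothetical weights for a combined popularity score (illustrative only).
WEIGHTS = {"downloads": 0.5, "stars": 0.3, "commits": 0.2}


def composite_score(metrics: dict[str, int]) -> float:
    """Weighted sum of log-scaled atomic metrics (an assumed formula)."""
    return sum(
        weight * math.log10(1 + metrics.get(name, 0))
        for name, weight in WEIGHTS.items()
    )


# Made-up numbers for an established project and a small one.
established = {"downloads": 50_000, "stars": 1_200, "commits": 800}
small = {"downloads": 200, "stars": 15, "commits": 40}

score_established = composite_score(established)
score_before = composite_score(small)

# Inflate only downloads and commit counts, the two signals the findings
# single out as easy to manipulate; stars are left untouched.
small_spoofed = {**small, "downloads": 5_000_000, "commits": 20_000}
score_after = composite_score(small_spoofed)

print(f"established:    {score_established:.2f}")
print(f"small (before): {score_before:.2f}")
print(f"small (after):  {score_after:.2f}")
```

Under these assumed weights, the small project overtakes the established one without any change to its star count, which is why a combined score is only as trustworthy as its weakest atomic input.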
Abstract
Quantitative metrics derived from software repositories and package ecosystems are widely used to assess the impact, popularity, maintenance, and criticality of free and open source software (FOSS) projects. However, these metrics are often assumed to be reliable despite their potential susceptibility to manipulation. Prior empirical software engineering and security research has used these metrics in a variety of ways that assume they do capture project impact and popularity. Yet the extent to which the underlying signals can be spoofed in practice, and the consequences this has for downstream uses of the metrics, have received little focused attention. To address this gap, the paper decomposes existing combined metrics into atomic metric categories, analyzes their spoofing effort under a maintainer-centered threat model, and investigates a real-world sybil attack on npm connected to an…
