Six Million (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Spams, and Malware
Hao He, Haoqin Yang, Philipp Burckhardt, Alexandros Kapravelos, Bogdan Vasilescu, Christian Kästner

TL;DR
This study systematically measures and analyzes the surge of fake stars on GitHub, revealing their patterns, purposes, and short-term promotion effects, with implications for platform security and open-source integrity.
Contribution
Introduces StarScout, a scalable tool for detecting fake stars on GitHub, and provides a comprehensive analysis of fake star activities from 2019 to 2024.
Findings
Fake star activities surged in 2024.
Accounts and repositories in fake star campaigns have highly trivial activity patterns.
Most fake stars promote malware and popular tech repositories.
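To make the idea of a "trivial activity pattern" concrete, here is a minimal sketch of one possible heuristic: flag accounts whose entire recorded history is a handful of star events and nothing else. This is not StarScout's actual algorithm, which this summary does not describe. The event tuple format, the `low_activity_accounts` function name, and the `max_events` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def low_activity_accounts(events, max_events=2):
    """Return accounts whose whole history is at most `max_events` events,
    all of them stars.

    `events` is an iterable of (actor, event_type, repo) tuples; this layout
    is a hypothetical simplification, not a real GitHub API format.
    """
    per_actor = defaultdict(list)
    for actor, event_type, _repo in events:
        per_actor[actor].append(event_type)

    return {
        actor
        for actor, types in per_actor.items()
        if len(types) <= max_events and all(t == "star" for t in types)
    }


# Toy data: one ordinary user and two accounts that only ever star.
events = [
    ("alice", "star", "org/tool"),
    ("alice", "push", "alice/site"),
    ("alice", "issue", "org/tool"),
    ("bot1", "star", "x/promo"),
    ("bot2", "star", "x/promo"),
    ("bot2", "star", "y/promo"),
]
flagged = low_activity_accounts(events)
```

On its own, a rule like this would also flag genuine casual users who have starred only one or two repositories, so it is best read as a starting signal that needs corroboration, not as a classifier.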
Abstract
GitHub, the de facto platform for open-source software development, provides a set of social-media-like features to signal high-quality repositories. Among them, the star count is the most widely used popularity signal, but it is also at risk of being artificially inflated (i.e., faked), decreasing its value as a decision-making signal and posing a security risk to all GitHub users. In this paper, we present a systematic, global, and longitudinal measurement study of fake stars in GitHub. To this end, we build StarScout, a scalable tool able to detect anomalous starring behaviors across all GitHub metadata between 2019 and 2024. Analyzing the data collected using StarScout, we find that: (1) fake-star-related activities have rapidly surged in 2024; (2) the accounts and repositories in fake star campaigns have highly trivial activity patterns; (3) the majority of fake stars are used to…
Taxonomy
Topics: Advanced Malware Detection Techniques · Spam and Phishing Detection · Cybercrime and Law Enforcement Studies
