Beneath the Mask: Can Contribution Data Unveil Malicious Personas in Open-Source Projects?
Ruby Nealon

TL;DR
This paper explores how graph-based analysis of contribution data from open-source projects can detect malicious personas, using the GitHub account behind the XZ Utils backdoor as a case study, with the aim of improving security and trust in open-source communities.
Contribution
It introduces a novel approach using OSINT data and graph analysis to identify malicious contributors across multiple open-source projects, addressing a gap in current tooling.
Findings
Graph analysis revealed anomalous contribution patterns.
The method successfully identified the malicious persona across projects.
The approach shows potential for real-time monitoring of contributor behavior.
Abstract
In February 2024, after building trust over two years with project maintainers by making a significant volume of legitimate contributions, GitHub user "JiaT75" self-merged a version of the XZ Utils project containing a highly sophisticated, well-disguised backdoor targeting sshd processes running on systems with the backdoored package installed. A month later, this package began to be distributed with popular Linux distributions until a Microsoft employee discovered the backdoor while investigating how a recent system upgrade impacted the performance of SSH authentication. Despite its potential global impact, no tooling exists for monitoring and identifying anomalous behavior by personas contributing to other open-source projects. This paper demonstrates how Open Source Intelligence (OSINT) data gathered from GitHub contributions, analyzed using graph databases and graph theory, can…
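The general idea of treating contribution data as a graph can be illustrated with a minimal sketch. This is not the paper's method or tooling: the event list, the contributor and repository names, and the helper functions `build_bipartite` and `co_contribution_edges` are all hypothetical, and the only analysis shown is a simple projection of a contributor-repository graph onto contributors.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (contributor, repository) contribution events; not real data.
EVENTS = [
    ("alice", "repo-a"),
    ("alice", "repo-b"),
    ("bob", "repo-a"),
    ("carol", "repo-b"),
    ("carol", "repo-c"),
    ("dave", "repo-c"),
]


def build_bipartite(events):
    """Build both sides of a contributor-repository bipartite graph."""
    repos_by_user = defaultdict(set)
    users_by_repo = defaultdict(set)
    for user, repo in events:
        repos_by_user[user].add(repo)
        users_by_repo[repo].add(user)
    return repos_by_user, users_by_repo


def co_contribution_edges(users_by_repo):
    """Project onto contributors: edge weight = number of shared repositories."""
    weights = defaultdict(int)
    for users in users_by_repo.values():
        # Every pair of contributors to the same repository shares an edge.
        for u, v in combinations(sorted(users), 2):
            weights[(u, v)] += 1
    return dict(weights)


repos_by_user, users_by_repo = build_bipartite(EVENTS)
edges = co_contribution_edges(users_by_repo)

for (u, v), weight in sorted(edges.items()):
    print(f"{u} -- {v}: {weight} shared repo(s)")
```

In a fuller pipeline the events would presumably come from collected GitHub contribution data and be loaded into a graph database, where questions such as which personas span an unusual set of projects could be asked as graph queries. Those steps are assumptions here and are left out of the sketch.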
