TL;DR
This paper introduces a method to trace the propagation of security vulnerabilities in open source software by analyzing code duplication and reuse across repositories using the World of Code infrastructure.
Contribution
It presents a novel approach and tools for identifying open source projects that reuse vulnerable code files and their previous revisions.
Findings
Vulnerable code files are widely reused across open source repositories.
The method traces code lineage and vulnerability propagation at scale.
Vulnerable code can be detected across multiple revisions of a file and the projects that contain them.
Abstract
This paper presents results from the MSR 2021 Hackathon. Our team investigates files and projects that contain known security vulnerabilities and how widely they are spread across open source repositories. These vulnerabilities can propagate through code reuse even when the vulnerability has been fixed in different versions of the code. We use the World of Code infrastructure to discover file-level duplication of code in a nearly complete collection of open source software. This paper describes a method and a set of tools to find all open source projects that use known vulnerable files and any previous revisions of those files.
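The file-level matching described in the abstract can be illustrated with a small sketch. It is not the authors' tooling and does not call any World of Code API. It assumes that file-level duplicates are identified by identical Git blob IDs, and it replaces the World of Code maps with a toy in-memory index. The function names `build_blob_index` and `find_affected_projects`, and the project names in the usage note, are hypothetical.

```python
import hashlib
from collections import defaultdict


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 ID Git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def build_blob_index(projects):
    """Map each blob ID to the set of projects containing that blob.

    `projects` is a toy stand-in for a repository collection:
    {project_name: [file_content_bytes, ...]}.
    """
    index = defaultdict(set)
    for project, files in projects.items():
        for content in files:
            index[git_blob_id(content)].add(project)
    return index


def find_affected_projects(vulnerable_revisions, index):
    """Return {project: matching blob IDs} for every project that holds
    a byte-identical copy of any known vulnerable revision."""
    hits = defaultdict(set)
    for content in vulnerable_revisions:
        blob = git_blob_id(content)
        for project in index.get(blob, ()):
            hits[project].add(blob)
    return dict(hits)
```

For example, given `index = build_blob_index({"a/x": [b"old"], "b/y": [b"new"], "c/z": [b"unrelated"]})`, calling `find_affected_projects([b"old", b"new"], index)` reports `a/x` and `b/y` but not `c/z`. Exact blob matching only finds unmodified copies; a file edited after copying gets a different ID and would need another technique.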
