Timelines for In-Code Discovery of Zero-Day Vulnerabilities and Supply-Chain Attacks
Andrew J. Lohn

TL;DR
This paper analyzes how long zero-day vulnerabilities remain undiscovered in code by examining version changes in open source software and modeling the likelihood of their discovery over time.
Contribution
It introduces a simple model to estimate the discoverability of vulnerabilities based on code revision patterns across multiple software projects.
Findings
Much of code revision behavior can be captured with a simple model.
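The model gives bounds on in-code discoverability, from well-disguised vulnerabilities (found only if the relevant lines are examined) to obvious ones (found if any part of the file is examined).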
Analysis covers over a billion lines of code across 87 software versions.
Abstract
Zero-day vulnerabilities can be accidentally or maliciously placed in code and can remain in place for years. In this study, we address an aspect of their longevity by considering the likelihood that they will be discovered in the code across versions. We approximate well-disguised vulnerabilities as only being discoverable if the relevant lines of code are explicitly examined, and obvious vulnerabilities as being discoverable if any part of the relevant file is examined. We analyze the version-to-version changes in three types of open source software (Mozilla Firefox, GNU/Linux, and glibc) to understand the rate at which the various pieces of code are amended and find that much of the revision behavior can be captured with a simple intuitive model. We use that model and the data from over a billion unique lines of code in 87 different versions of software to specify the bounds for…
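The two bounding cases described in the abstract can be illustrated with a small sketch. This is not the paper's model or data: the `Revision` structure, the `first_discovery` function, the file names, and the line numbers are all hypothetical, and the sketch assumes that code counts as "examined" when a version-to-version change touches it. It also ignores line renumbering between versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """Lines of one file touched by one version-to-version change (hypothetical)."""
    file: str
    changed_lines: frozenset


def first_discovery(revisions, vuln_file, vuln_lines, obvious):
    """Return the 1-based index of the first revision that would expose the
    vulnerability, or None if no revision does.

    obvious=True : any change to the vulnerable file exposes it.
    obvious=False: only a change touching the vulnerable lines exposes it.
    Assumption: "examined" is approximated by "changed in this revision".
    """
    vuln_lines = set(vuln_lines)
    for index, rev in enumerate(revisions, start=1):
        if rev.file != vuln_file:
            continue
        if obvious and rev.changed_lines:
            return index
        if not obvious and vuln_lines & rev.changed_lines:
            return index
    return None


# Toy revision history; file names and line numbers are made up.
history = [
    Revision("net/parse.c", frozenset({10, 11})),
    Revision("net/util.c", frozenset({120})),
    Revision("net/parse.c", frozenset({118, 120, 121})),
]

obvious_at = first_discovery(history, "net/parse.c", {120}, obvious=True)
hidden_at = first_discovery(history, "net/parse.c", {120}, obvious=False)
missed = first_discovery(history, "net/parse.c", {500}, obvious=False)
print(obvious_at, hidden_at, missed)
```

In this toy history the obvious case is exposed by the first change to the file, while the well-disguised case waits until a change reaches line 120. The gap between those two indices is the kind of range the bounds are meant to bracket.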
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Software Reliability and Analysis Research
