FaultLine: Automated Proof-of-Vulnerability Generation Using LLM Agents
Vikram Nitin, Baishakhi Ray, Roshanak Zilouchian Moghaddam

TL;DR
FaultLine is an LLM agent workflow that automatically generates proof-of-vulnerability (PoV) tests by reasoning about how control and data flow from an externally accessible entry point to the vulnerable code. It outperforms an existing agent baseline on vulnerabilities spanning multiple programming languages.
Contribution
This paper introduces FaultLine, an LLM agent framework that automates PoV test generation without relying on language-specific static or dynamic analysis. It generates PoV tests for 16 of 100 vulnerabilities, a 77% improvement over the CodeAct 2.1 baseline.
Findings
FaultLine successfully generated PoV tests for 16 out of 100 vulnerabilities.
It outperformed the state-of-the-art agent CodeAct 2.1 by 77% in relative terms.
Hierarchical reasoning enhances LLM performance in vulnerability testing.
Abstract
Despite the critical threat posed by software security vulnerabilities, reports are often incomplete, lacking the proof-of-vulnerability (PoV) tests needed to validate fixes and prevent regressions. These tests are crucial not only for ensuring patches work, but also for helping developers understand how vulnerabilities can be exploited. Generating PoV tests is a challenging problem, requiring reasoning about the flow of control and data through deeply nested levels of a program. We present FaultLine, an LLM agent workflow that uses a set of carefully designed reasoning steps, inspired by aspects of traditional static and dynamic program analysis, to automatically generate PoV test cases. Given a software project with an accompanying vulnerability report, FaultLine 1) traces the flow of an input from an externally accessible API ("source") to the "sink" corresponding to the…
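To make the "source" to "sink" idea concrete, here is a minimal sketch of what such a trace looks like as data. It is not FaultLine's method: FaultLine has an LLM agent reason about the code, whereas this sketch runs a plain breadth-first search over a hand-written call graph. The graph, the function names (handle_request, copy_to_buffer, and so on), and the helper trace_source_to_sink are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical call graph (caller -> callees); every name here is invented.
CALL_GRAPH = {
    "handle_request": ["parse_params", "log_access"],  # externally reachable "source"
    "parse_params": ["decode_field"],
    "decode_field": ["copy_to_buffer"],
    "log_access": [],
    "copy_to_buffer": [],  # assumed vulnerable "sink"
}


def trace_source_to_sink(graph, source, sink):
    """Return the shortest call chain from source to sink, or None if unreachable."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == sink:
            return path
        for callee in graph.get(path[-1], []):
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None


path = trace_source_to_sink(CALL_GRAPH, "handle_request", "copy_to_buffer")
print(" -> ".join(path))
# prints: handle_request -> parse_params -> decode_field -> copy_to_buffer
```

A PoV test would then need an input to the source that survives every step along this chain and still triggers the flaw at the sink, which is the part that requires reasoning about data and branch conditions rather than reachability alone.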
