Patch2Vuln: Agentic Reconstruction of Vulnerabilities from Linux Distribution Binary Patches
Isaac David, Arthur Gervais

TL;DR
This paper presents Patch2Vuln, a pipeline that combines a language-model agent with binary differencing to reconstruct the security meaning of Linux distribution updates from binary packages alone, evaluated on 25 Ubuntu `.deb` package pairs.
Contribution
It introduces a local, resumable pipeline that reconstructs vulnerabilities from binary-derived evidence alone, combining Ghidra and Ghidriff differencing, changed-function ranking, and an offline language-model agent.
Findings
Localized security-relevant functions in 50% of security-update pairs.
Correctly identified root-cause classes in 55% of security-update pairs.
Binary differencing and function ranking are key limiting factors.
Abstract
Security updates create a short but important window in which defenders and attackers can compare vulnerable and patched software. Yet in many operational settings, the most accessible artifacts are binary packages rather than source patches or advisory text. This paper asks whether a language-model agent, restricted to local binary-derived evidence, can reconstruct the security meaning of Linux distribution updates. Patch2Vuln is a local, resumable pipeline that extracts old/new ELF pairs, diffs them with Ghidra and Ghidriff, ranks changed functions, builds candidate dossiers, and asks an offline agent to produce a preliminary audit, bounded validation plan, and final audit. We evaluate Patch2Vuln on 25 Ubuntu `.deb` package pairs: 20 security-update pairs and five negative controls, all manually adjudicated against private source-patch and binary-function ground truth. The agent…
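The first stage described in the abstract, collecting old/new ELF pairs from two package versions, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes both `.deb` files have already been unpacked into two directory trees (for example with `dpkg-deb -x`), and the function names `is_elf`, `sha256_of`, and `changed_elf_pairs` are hypothetical. Files are matched by identical relative path, which is a simplifying assumption.

```python
import hashlib
from pathlib import Path

# Every ELF file begins with these four bytes.
ELF_MAGIC = b"\x7fELF"


def is_elf(path: Path) -> bool:
    """Return True if the file starts with the 4-byte ELF magic number."""
    try:
        with path.open("rb") as fh:
            return fh.read(4) == ELF_MAGIC
    except OSError:
        return False


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of the file contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_elf_pairs(old_root: Path, new_root: Path) -> list[tuple[Path, Path]]:
    """Pair ELF files that exist at the same relative path in both trees
    and whose contents differ (hypothetical helper, not from the paper)."""
    pairs = []
    for old_file in sorted(old_root.rglob("*")):
        # Skip symlinks, directories, and anything that is not an ELF file.
        if old_file.is_symlink() or not old_file.is_file() or not is_elf(old_file):
            continue
        new_file = new_root / old_file.relative_to(old_root)
        if not (new_file.is_file() and is_elf(new_file)):
            continue
        # Identical binaries carry no patch signal, so only changed pairs are kept.
        if sha256_of(old_file) != sha256_of(new_file):
            pairs.append((old_file, new_file))
    return pairs
```

Each returned pair would then be handed to a binary differ. Matching by exact relative path misses files whose names change between versions, such as versioned shared libraries, so a real pipeline would likely need a looser matching rule.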
