FORGE: An LLM-driven Framework for Large-Scale Smart Contract Vulnerability Dataset Construction
Jiachi Chen, Yiming Shen, Jiashuo Zhang, Zihao Li, John Grundy, Zhenzhe Shao, Yanlin Wang, Jiashui Wang, Ting Chen, Zibin Zheng

TL;DR
FORGE introduces an automated, LLM-driven framework for constructing large-scale, high-quality smart contract vulnerability datasets, addressing manual annotation limitations and standardizing vulnerability classification.
Contribution
It is the first automated approach leveraging LLMs to extract and classify vulnerabilities from audit reports into CWE categories, improving dataset scale and consistency.
Findings
The generated dataset contains 81,390 Solidity files and 27,497 vulnerabilities.
Extraction precision is 95.6%, and inter-rater agreement is 0.87.
Benchmarking reveals limitations of existing security tools.
Abstract
High-quality smart contract vulnerability datasets are critical for evaluating security tools and advancing smart contract security research. Two major limitations of current manual dataset construction are (1) labor-intensive and error-prone annotation processes, which limit the scale, quality, and evolution of the dataset, and (2) the absence of standardized classification rules, which results in inconsistent vulnerability categories and labeling results across different datasets. To address these limitations, we present FORGE, the first automated approach for constructing smart contract vulnerability datasets. FORGE leverages an LLM-driven pipeline to extract high-quality vulnerabilities from real-world audit reports and classify them according to the CWE, the most widely recognized classification in software security. FORGE employs a divide-and-conquer strategy to extract structured and…
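To make the idea of CWE-classified findings concrete, here is a minimal Python sketch of what one extracted record might look like and how records could be tallied per category. The `Finding` schema, its field names, and the sample records are hypothetical illustrations, not FORGE's actual data format or output. The two CWE identifiers used (CWE-284, Improper Access Control; CWE-682, Incorrect Calculation) are real CWE entries.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Finding:
    """One vulnerability extracted from an audit report (hypothetical schema)."""
    report_id: str           # hypothetical identifier of the source audit report
    title: str               # finding title as written by the auditors
    cwe_id: str              # assigned CWE category, e.g. "CWE-284"
    files: Tuple[str, ...]   # Solidity files the finding refers to


def count_by_cwe(findings: Iterable[Finding]) -> Counter:
    """Count how many findings fall under each CWE category."""
    return Counter(f.cwe_id for f in findings)


# Made-up sample records, for illustration only.
sample = [
    Finding("report-001", "Missing owner check on withdraw", "CWE-284", ("Vault.sol",)),
    Finding("report-001", "Rounding error in fee calculation", "CWE-682", ("Fees.sol",)),
    Finding("report-002", "Unrestricted mint function", "CWE-284", ("Token.sol",)),
]

counts = count_by_cwe(sample)
print(counts.most_common())  # [('CWE-284', 2), ('CWE-682', 1)]
```

Keying every record to a single shared taxonomy is what makes counts like these comparable across reports. This is the consistency problem the abstract attributes to manually built datasets.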
