Reentrancy Detection in the Age of LLMs
Dalila Ressi, Alvise Spanò, Matteo Rizzo, Lorenzo Benetollo, Sabina Rossi

TL;DR
This paper evaluates the effectiveness of static analyzers, machine learning models, and large language models in detecting reentrancy vulnerabilities in modern Ethereum smart contracts, revealing LLMs' superior performance.
Contribution
It introduces two new benchmarks for reentrancy detection in Solidity 0.8+ and systematically compares traditional tools, ML models, and LLMs on these datasets.
Findings
LLMs outperform traditional tools and ML models in reentrancy detection.
Top LLMs achieve up to 0.96 F1 score in zero-shot settings.
Existing tools show significant gaps in robustness on modern smart contracts.
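For readers unfamiliar with the metric quoted above, F1 is the harmonic mean of precision and recall on the positive (here, reentrant) class. The snippet below is a minimal sketch of that standard definition; the function name `f1_score` and the counts in the usage line are hypothetical and are not taken from the paper's results.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Compute F1 from confusion-matrix counts for the positive class.

    tp: contracts correctly flagged as reentrant
    fp: safe contracts wrongly flagged as reentrant
    fn: reentrant contracts the detector missed
    Returns 0.0 when there are no true positives, where F1 is
    conventionally treated as zero.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not the paper's data).
print(f1_score(tp=8, fp=2, fn=2))
```

Because F1 ignores true negatives, a detector cannot score well simply by labeling every contract as safe.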
Abstract
Reentrancy remains one of the most critical classes of vulnerabilities in Ethereum smart contracts, yet widely used detection tools and datasets continue to reflect outdated patterns and obsolete Solidity versions. This paper adopts a dependability-oriented perspective on reentrancy detection in Solidity 0.8+, assessing how reliably state-of-the-art static analyzers and AI-based techniques operate on modern code by putting them to the test on two fronts. We construct two manually verified benchmarks: an Aggregated Benchmark of 432 real-world contracts, consolidated and relabeled from prior datasets, and a Reentrancy Scenarios Dataset (RSD) of 143 handcrafted minimal working examples designed to isolate and stress-test individual reentrancy patterns. We then evaluate 12 formal-methods-based tools, 10 machine-learning models, and 9 large language models. On the Aggregated…
