Logic Meets Magic: LLMs Cracking Smart Contract Vulnerabilities

ZeKe Xiao; Qin Wang; Hammond Pearce; Shiping Chen

arXiv:2501.07058·cs.CR·January 14, 2025


TL;DR

This paper evaluates the effectiveness of recent LLMs in detecting smart contract vulnerabilities in Solidity v0.8, highlighting improvements in false-positive reduction and challenges in recall rates due to new library dependencies.

Contribution

It provides the first comprehensive evaluation of LLM-based vulnerability detection on Solidity v0.8 using the latest models and proposes prompt design techniques to reduce false positives.

Findings

01

Prompt design reduces false positives by over 60%.

02

Recall rate for some vulnerabilities drops to 13% in Solidity v0.8.

03

Detection relies heavily on identifying changes in libraries and frameworks.
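
For readers less familiar with the metrics behind these findings, the sketch below shows how recall and false-positive rate are commonly computed from detection counts, and how a relative reduction between two prompts could be measured. It uses the standard textbook definitions; the paper's exact evaluation protocol is not given on this page, so whether its "over 60%" figure is a relative or an absolute reduction is an assumption here. The names `Counts`, `recall`, `false_positive_rate`, and `relative_reduction` are hypothetical, and all numbers are made up for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts for one detector/prompt (illustrative only)."""
    tp: int  # vulnerable contracts correctly flagged
    fp: int  # safe contracts incorrectly flagged
    fn: int  # vulnerable contracts missed
    tn: int  # safe contracts correctly passed


def recall(c: Counts) -> float:
    """Fraction of truly vulnerable contracts that were flagged."""
    total = c.tp + c.fn
    return c.tp / total if total else 0.0


def false_positive_rate(c: Counts) -> float:
    """Fraction of safe contracts that were incorrectly flagged."""
    total = c.fp + c.tn
    return c.fp / total if total else 0.0


def relative_reduction(before: float, after: float) -> float:
    """Relative drop from `before` to `after` (assumes before > 0)."""
    return (before - after) / before


# Hypothetical counts, NOT results from the paper.
baseline_prompt = Counts(tp=8, fp=50, fn=2, tn=50)
refined_prompt = Counts(tp=8, fp=20, fn=2, tn=80)

fpr_before = false_positive_rate(baseline_prompt)
fpr_after = false_positive_rate(refined_prompt)

print(f"recall (refined prompt): {recall(refined_prompt):.2f}")
print(f"FPR before: {fpr_before:.2f}, after: {fpr_after:.2f}")
print(f"relative FPR reduction: {relative_reduction(fpr_before, fpr_after):.0%}")
```

Keeping recall and false-positive rate separate matters for reading the findings: a prompt can cut false positives sharply while recall for a given vulnerability class stays low.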

Abstract

Smart contract vulnerabilities have caused significant economic losses in blockchain applications. Large Language Models (LLMs) provide new possibilities for the time-consuming task of detecting them. However, state-of-the-art LLM-based detection solutions are often plagued by high false-positive rates. In this paper, we push the boundaries of existing research in two key ways. First, our evaluation is based on Solidity v0.8, offering more up-to-date insights than prior studies that focus on older versions (v0.4). Second, we use five of the latest LLMs (from different companies), ensuring comprehensive coverage of the most advanced capabilities in the field. We conducted a series of rigorous evaluations. Our experiments demonstrate that a well-designed prompt can reduce the false-positive rate by over 60%. Surprisingly, we also discovered that the recall rate for detecting some…
