TL;DR
This paper introduces Smart-LLaMA-DPO, a large language model aligned with direct preference optimization for explainable smart contract vulnerability detection. It targets two gaps in blockchain security: limited datasets and LLMs' misinterpretation of security concepts.
Contribution
The authors construct a comprehensive dataset and apply continual pre-training (CPT), supervised fine-tuning (SFT), and direct preference optimization (DPO) to improve both vulnerability detection and explanation quality.
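To make the final alignment stage concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. It is not taken from the paper: the function names, the `beta` value, and the example log-probabilities are assumptions for illustration. In this setting the "chosen" response would plausibly be a higher-quality vulnerability explanation and the "rejected" response a flawed one, but the exact pair construction is not stated in this summary.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the summary does not give one
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response
    under either the trainable policy or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals softplus(-margin)
    return softplus(-margin)


if __name__ == "__main__":
    # Hypothetical log-probabilities, for illustration only.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The loss equals log 2 when the policy and the reference model agree, and it falls as the policy raises the likelihood of the chosen response relative to the rejected one.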
Findings
Detection accuracy and F1 score improve significantly.
The model's explanations are more accurate, thorough, and clear.
The model outperforms state-of-the-art baselines across multiple vulnerability types.
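For readers unfamiliar with the metric, the following sketch shows how an F1 score is derived from detection counts. The function name and the counts in the usage line are hypothetical and are not results from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall.

    tp: vulnerable contracts correctly flagged
    fp: safe contracts incorrectly flagged
    fn: vulnerable contracts that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(f1_score(tp=8, fp=2, fn=2))
```

Unlike accuracy, F1 ignores true negatives, so it stays informative when vulnerable contracts are rare in the evaluation set.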
Abstract
Smart contract vulnerability detection remains a major challenge in blockchain security. Existing vulnerability detection methods face two main issues: (1) Existing datasets lack comprehensive coverage and high-quality explanations for preference learning. (2) Large language models (LLMs) often struggle with accurately interpreting specific concepts in smart contract security. Empirical analysis shows that even after continual pre-training (CPT) and supervised fine-tuning (SFT), LLMs may misinterpret the execution order of state changes, resulting in incorrect explanations despite making correct detection decisions. To address these challenges, we propose Smart-LLaMA-DPO based on LLaMA-3.1-8B. We construct a comprehensive dataset covering four major vulnerability types and machine-unauditable vulnerabilities, including precise labels, explanations, and locations for SFT, as well as…
Taxonomy
Methods: Direct Preference Optimization · Supervised Fine-Tuning
