Smart-LLaMA: Two-Stage Post-Training of Large Language Models for Smart Contract Vulnerability Detection and Explanation
Lei Yu, Shiqi Chen, Hang Yuan, Peng Wang, Zhirong Huang, Jingyuan Zhang, Chenjie Shen, Fengjun Zhang, Li Yang, Jiajia Ma

TL;DR
Smart-LLaMA is a novel two-stage post-training approach for large language models that improves smart contract vulnerability detection and explanation by domain-specific pre-training and explanation-guided fine-tuning, outperforming existing methods.
Contribution
It introduces a comprehensive dataset, smart contract-specific continual pre-training, and explanation-guided fine-tuning for LLMs in smart contract security.
Findings
Outperforms state-of-the-art baselines in detection accuracy
Provides reliable and detailed vulnerability explanations
Improves F1 score by 6.49% and accuracy by 3.78%
Abstract
With the rapid development of blockchain technology, smart contract security has become a critical challenge. Existing smart contract vulnerability detection methods face three main issues: (1) Insufficient quality of datasets, lacking detailed explanations and precise vulnerability locations. (2) Limited adaptability of large language models (LLMs) to the smart contract domain, as most LLMs are pre-trained on general text data with minimal smart contract-specific data. (3) Lack of high-quality explanations for detected vulnerabilities, as existing methods focus solely on detection without clear explanations. These limitations hinder detection performance and make it harder for developers to understand and fix vulnerabilities quickly, potentially leading to severe financial losses. To address these problems, we propose Smart-LLaMA, an advanced detection method based on the LLaMA language…
Taxonomy
Topics: Artificial Intelligence in Law · FinTech, Crowdfunding, Digital Finance · Insurance and Financial Risk Management
Methods: LLaMA · Focus
