Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection

S M Mostaq Hossain; Amani Altarawneh; Jesse Roberts

arXiv:2501.02229·cs.CR·March 11, 2025

Open Access

TL;DR

This paper demonstrates that fine-tuned large language models significantly outperform traditional machine learning models in detecting various vulnerabilities in smart contract code, achieving over 90% accuracy and enhancing blockchain security.

Contribution

It introduces a novel approach of fine-tuning large language models specifically for smart contract vulnerability detection, surpassing existing benchmarks and providing insights into model strengths.

Findings

01. Fine-tuned LLMs achieve over 90% accuracy in vulnerability detection.

02. LLMs outperform traditional ML models in identifying subtle code vulnerabilities.

03. The approach enhances the security and robustness of smart contracts in blockchain systems.

Abstract

As blockchain technology and smart contracts become widely adopted, securing them at every stage of the transaction process is essential. This work aims to improve smart contract security by detecting vulnerabilities with classical Machine Learning (ML) models and fine-tuned Large Language Models (LLMs). It relies on a labeled smart contract dataset with annotated vulnerabilities, on which several LLMs, such as DistilBERT, and various traditional machine learning algorithms are trained and tested. We train and test these models to classify smart contract code by vulnerability type in order to compare their performance. Fine-tuning the LLMs specifically for smart contract code classification should improve detection of several types of well-known vulnerabilities, such as…
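
The classification framing in the abstract (labeled contract code in, vulnerability type out, accuracy as the comparison metric) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy Solidity-like snippets, the two label names, and the bag-of-tokens nearest-centroid classifier are stand-ins, not the paper's dataset, label set, or models. The label names were picked for the example and are not taken from the truncated list above.

```python
import math
import re
from collections import Counter

# Hypothetical toy snippets and labels, invented for illustration only;
# they are NOT drawn from the paper's dataset or its label set.
TRAIN = [
    ('msg.sender.call.value(amount)(""); balances[msg.sender] = 0;', "reentrancy"),
    ('(bool ok, ) = msg.sender.call{value: amount}(""); balances[msg.sender] -= amount;', "reentrancy"),
    ("require(block.timestamp > deadline); winner = msg.sender;", "timestamp_dependency"),
    ("if (now % 2 == 0) { payout(msg.sender); }", "timestamp_dependency"),
]


def tokenize(code):
    """Split source text into lowercase identifier-like tokens."""
    return re.findall(r"[a-z_]+", code.lower())


def train_centroids(samples):
    """Sum token counts per label: one bag-of-tokens centroid per class."""
    centroids = {}
    for code, label in samples:
        centroids.setdefault(label, Counter()).update(tokenize(code))
    return centroids


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(code, centroids):
    """Return the label whose centroid is most similar to the snippet."""
    vec = Counter(tokenize(code))
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def accuracy(samples, centroids):
    """Fraction of samples whose predicted label matches the given label."""
    correct = sum(predict(code, centroids) == label for code, label in samples)
    return correct / len(samples)


centroids = train_centroids(TRAIN)
unseen = "if (block.timestamp > start + 1 days) { release(); }"
print(predict(unseen, centroids))
print(accuracy(TRAIN, centroids))
```

A fine-tuned LLM would replace `tokenize` and the centroid comparison with a learned encoder and classification head, but the input, output, and accuracy computation keep the same shape. Accuracy measured on the training snippets, as in the last line, says nothing about generalization; a held-out test split would be needed for a meaningful comparison.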

Taxonomy

Topics: Artificial Intelligence in Law · Cybercrime and Law Enforcement Studies · Border Security and International Relations