Decompiling Smart Contracts with a Large Language Model

Isaac David; Liyi Zhou; Dawn Song; Arthur Gervais; Kaihua Qin

arXiv:2506.19624·cs.CR·June 25, 2025


TL;DR

This paper introduces a novel decompilation pipeline that uses Large Language Models to convert Ethereum bytecode into human-readable Solidity code, significantly improving over traditional methods in accuracy and readability.

Contribution

It presents the first successful use of LLMs for semantic decompilation of EVM bytecode, combining static analysis and fine-tuned models for high-quality code recovery.

Findings

01. Achieved an average semantic similarity of 0.82 with original source code.

02. Outperformed traditional decompilers in code readability and accuracy.

03. Demonstrated practical application through a publicly accessible system.

Abstract

Source code verification remains rare on blockchain explorers such as Etherscan: of the 78,047,845 smart contracts deployed on Ethereum (as of May 26, 2025), a mere 767,520 (< 1%) are open source, which presents a severe impediment to blockchain security. This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode, a fundamental research challenge with direct implications for identifying vulnerabilities and understanding malicious behavior. Prevailing decompilers struggle to reverse bytecode in a readable manner, often yielding convoluted code that critically hampers vulnerability analysis and thwarts efforts to dissect contract functionalities for security auditing. This paper addresses this challenge by introducing a pioneering decompilation pipeline that, for the first time, successfully leverages Large Language Models (LLMs) to…
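As a quick check of the figure quoted in the abstract, the open-source share follows directly from the two counts it gives. The snippet below is only an illustrative sketch; the variable names are chosen here and do not come from the paper.

```python
# Counts as stated in the abstract (Ethereum, as of May 26, 2025).
deployed_contracts = 78_047_845
open_source_contracts = 767_520

# Fraction of deployed contracts with published source code.
open_source_share = open_source_contracts / deployed_contracts

print(f"Open-source share: {open_source_share:.2%}")
```

The result falls just under 1%, consistent with the "< 1%" stated in the abstract.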


Taxonomy

Topics: FinTech, Crowdfunding, Digital Finance