WaDec: Decompiling WebAssembly Using Large Language Model
Xinyu She, Yanjie Zhao, Haoyu Wang

TL;DR
WaDec introduces a fine-tuned large language model approach for decompiling WebAssembly into more readable source code, significantly outperforming existing tools in accuracy, recompilability, and similarity metrics.
Contribution
This paper presents the first use of a fine-tuned LLM for WebAssembly decompilation, improving readability and accuracy over traditional and existing LLM-based decompilers.
Findings
Achieves a 3.34% code inflation rate, 97% lower than prior tools.
Maintains a 52.11% recompilability rate and 43.55% re-execution rate.
Outperforms state-of-the-art in AST similarity, cyclomatic complexity, and cosine similarity.
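The findings above are reported as percentages and similarity scores. As a rough illustration only, the sketch below shows one plausible way such figures could be computed over a set of decompiled samples. The function names and the definitions used here (inflation as relative growth in non-blank line count, cosine similarity over bag-of-token counts) are assumptions made for this example and are not taken from the paper, which may define its metrics differently.

```python
from collections import Counter
import math
import re


def rate(flags):
    """Percentage of True entries in a list of per-sample pass/fail flags.

    Usable for a recompilability or re-execution rate, assuming each flag
    records whether one decompiled sample recompiled (or re-ran) correctly.
    """
    return 100.0 * sum(flags) / len(flags) if flags else 0.0


def inflation_rate(original_src, decompiled_src):
    """Relative growth in non-blank line count, in percent (assumed definition)."""
    def loc(text):
        return sum(1 for line in text.splitlines() if line.strip())
    return 100.0 * (loc(decompiled_src) - loc(original_src)) / loc(original_src)


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-token count vectors of two snippets."""
    tokens_a = Counter(re.findall(r"\w+|[^\w\s]", a))
    tokens_b = Counter(re.findall(r"\w+|[^\w\s]", b))
    dot = sum(tokens_a[t] * tokens_b[t] for t in tokens_a)
    norm_a = math.sqrt(sum(v * v for v in tokens_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tokens_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Under these assumed definitions, a lower inflation rate means the decompiled output stays close in length to the original source, while higher recompilability, re-execution, and cosine similarity values are better.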
Abstract
WebAssembly (abbreviated Wasm) has emerged as a cornerstone of web development, offering a compact binary format that allows high-performance applications to run at near-native speeds in web browsers. Despite its advantages, Wasm's binary nature presents significant challenges for developers and researchers, particularly regarding readability when debugging or analyzing web applications. Therefore, effective decompilation becomes crucial. Unfortunately, traditional decompilers often struggle with producing readable outputs. While some large language model (LLM)-based decompilers have shown good compatibility with general binary files, they still face specific challenges when dealing with Wasm. In this paper, we introduce a novel approach, WaDec, which is the first use of a fine-tuned LLM to interpret and decompile Wasm binary code into a higher-level, more comprehensible source code…
Taxonomy
Topics: Manufacturing Process and Optimization
