Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement
Yunlong Feng, Dechuan Teng, Yang Xu, Honglin Mu, Xiao Xu, Libo Qin, Qingfu Zhu, Wanxiang Che

TL;DR
This paper combines self-constructed in-context examples with fine-grained assembly-to-source alignment for LLM-based decompilation, improving re-executability by approximately 3.90% and setting a new state of the art of 52.41% on the Decompile-Eval benchmark.
Contribution
It proposes two methods that improve decompilation accuracy without scaling up model parameters or pre-training data: sc$^2$dec, which requires no fine-tuning, and FAE, which is applied during fine-tuning.
Findings
Achieved 52.41% re-executability on the Decompile-Eval benchmark.
Improved decompilation performance by approximately 3.90%.
Established new state-of-the-art results.
Abstract
Decompilation transforms compiled code back into a high-level programming language for analysis when source code is unavailable. Previous work has primarily focused on enhancing decompilation performance by increasing the scale of model parameters or training data for pre-training. Based on the characteristics of the decompilation task, we propose two methods: (1) Without fine-tuning, the Self-Constructed Context Decompilation (sc$^2$dec) method recompiles the LLM's decompilation results to construct pairs for in-context learning, helping the model improve decompilation performance. (2) Fine-grained Alignment Enhancement (FAE), which meticulously aligns assembly code with source code at the statement level by leveraging debugging information, is employed during the fine-tuning phase to achieve further improvements in decompilation. By integrating these two methods, we achieved a…
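The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The prompt layout, the function names (`build_prompt`, `self_constructed_decompile`, `group_asm_by_source_line`), the fallback on compile failure, and the two callables `decompile` and `compile_to_asm` are all hypothetical stand-ins for an LLM call and a compiler invocation. The first function shows the self-constructed context loop: decompile once, recompile the draft, and reuse the (recompiled assembly, draft source) pair as an in-context example. The second shows one way statement-level alignment could be recovered from debugging information, assuming the assembly contains GNU assembler `.loc file line [column]` directives such as those emitted when compiling with debug info.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Optional, Tuple


def build_prompt(target_asm: str,
                 example: Optional[Tuple[str, str]] = None) -> str:
    """Format a decompilation prompt (hypothetical layout), optionally
    preceded by one in-context (assembly, source) example pair."""
    parts = []
    if example is not None:
        example_asm, example_src = example
        parts.append(f"# Assembly:\n{example_asm}\n# Source:\n{example_src}\n")
    parts.append(f"# Assembly:\n{target_asm}\n# Source:\n")
    return "\n".join(parts)


def self_constructed_decompile(
    target_asm: str,
    decompile: Callable[[str], str],                 # assumed: LLM call
    compile_to_asm: Callable[[str], Optional[str]],  # assumed: compiler call
) -> str:
    """Decompile once, recompile the draft, then decompile again with the
    self-constructed pair as context."""
    draft = decompile(build_prompt(target_asm))
    recompiled_asm = compile_to_asm(draft)
    if recompiled_asm is None:
        # Assumption: if the draft does not compile, keep the draft as-is.
        return draft
    pair = (recompiled_asm, draft)
    return decompile(build_prompt(target_asm, example=pair))


def group_asm_by_source_line(asm_text: str) -> Dict[int, List[str]]:
    """Group assembly lines under the source line named by the most recent
    `.loc` directive, preserving first-seen order of source lines."""
    groups: Dict[int, List[str]] = OrderedDict()
    current: Optional[int] = None
    for raw in asm_text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == ".loc" and len(fields) >= 3:
            current = int(fields[2])  # .loc <file> <line> [column]
            groups.setdefault(current, [])
        elif current is not None:
            groups[current].append(raw.strip())
    return groups
```

In practice `decompile` would wrap a model query and `compile_to_asm` would wrap a compiler and disassembler; both are injected here so the control flow can be read and exercised without either dependency.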
Taxonomy
Topics: Web Data Mining and Analysis · Context-Aware Activity Recognition Systems · Machine Learning and Data Classification
