Refining Decompiled C Code with Large Language Models
Wai Kin Wong, Huaijin Wang, Zongjie Li, Zhibo Liu, Shuai Wang, Qiyi Tang, Sen Nie, Shi Wu

TL;DR
This paper explores using large language models to improve decompiler outputs, aiming to produce recompilable C code from executables, which could significantly enhance reverse engineering workflows.
Contribution
It introduces a novel two-step hybrid approach leveraging LLMs to augment decompiler outputs for recompilability, addressing a key challenge in reverse engineering.
Findings
Achieved a recompilation success rate of over 75% with LLM augmentation
Demonstrated that unmodified decompiler outputs are largely non-recompilable
Identified obstacles to automated recompilation and potential solutions
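A recompilation success rate like the one reported can be read as the fraction of decompiled C units that a compiler accepts. The following is a minimal Python sketch of that measurement, not the paper's evaluation harness: the function names `compiles_with_cc` and `recompilation_success_rate` are hypothetical, and the sketch assumes a C compiler is reachable as `cc` on the PATH. Note that compiling successfully is a weaker property than behaving like the original executable.

```python
import os
import shutil
import subprocess
import tempfile
from typing import Callable, Iterable


def compiles_with_cc(source: str, compiler: str = "cc") -> bool:
    """Return True if `compiler -c` accepts the given C source.

    Hypothetical helper: assumes a C compiler named `compiler` is on PATH.
    """
    if shutil.which(compiler) is None:
        raise RuntimeError(f"compiler {compiler!r} not found on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "unit.c")
        obj = os.path.join(tmp, "unit.o")
        with open(src, "w") as f:
            f.write(source)
        # -c compiles to an object file without linking.
        result = subprocess.run(
            [compiler, "-c", src, "-o", obj],
            capture_output=True,
        )
        return result.returncode == 0


def recompilation_success_rate(
    sources: Iterable[str],
    check: Callable[[str], bool] = compiles_with_cc,
) -> float:
    """Fraction of sources for which `check` returns True (0.0 if empty)."""
    units = list(sources)
    if not units:
        return 0.0
    passed = sum(1 for unit in units if check(unit))
    return passed / len(units)
```

The checker is passed in as a parameter so the same rate computation can be applied to raw decompiler output and to LLM-refined output, or exercised with a stub when no compiler is available.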
Abstract
A C decompiler converts an executable into source code. The recovered C source code, once re-compiled, is expected to produce an executable with the same functionality as the original executable. With over twenty years of development, C decompilers have been widely used in production to support reverse engineering applications. Despite the prosperous development of C decompilers, it is widely acknowledged that decompiler outputs are mainly used for human consumption, and are not suitable for automatic recompilation. Often, a substantial amount of manual effort is required to fix the decompiler outputs before they can be recompiled and executed properly. This paper is motivated by the recent success of large language models (LLMs) in comprehending dense corpora of natural language. To alleviate the tedious, costly and often error-prone manual effort in fixing decompiler outputs, we…
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques
