FidelityGPT: Correcting Decompilation Distortions with Retrieval Augmented Generation
Zhiping Zhou, Xiaohong Li, Ruitao Feng, Yao Zhang, Yuekang Li, Wenbu Feng, Yunqian Wang, Yuqing Li

TL;DR
FidelityGPT improves the accuracy and readability of decompiled code by detecting and correcting semantic distortions with retrieval-augmented generation and distortion-aware prompt templates.
Contribution
The paper introduces FidelityGPT, a framework that improves decompilation fidelity through distortion detection, retrieval of semantically similar code, and variable dependency analysis, and that reports better results than previous methods.
Findings
Achieved 89% detection accuracy and 83% precision in identifying distortions.
Attained a 94% fix rate and a 64% corrected fix rate, surpassing state-of-the-art methods.
Demonstrated effectiveness on 620 function pairs from a binary similarity benchmark.
Abstract
Decompilation converts machine code into human-readable form, enabling analysis and debugging without source code. However, fidelity issues often degrade the readability and semantic accuracy of decompiled output. Existing methods, such as variable renaming or structural simplification, provide partial improvements but lack robust detection and correction, particularly for complex closed-source binaries. We present FidelityGPT, a framework that enhances decompiled code accuracy and readability by systematically detecting and correcting semantic distortions. FidelityGPT introduces distortion-aware prompt templates tailored to closed-source settings and integrates Retrieval-Augmented Generation (RAG) with a dynamic semantic intensity algorithm to locate distorted lines and retrieve semantically similar code from a database. A variable dependency algorithm further mitigates long-context…
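The retrieval step mentioned in the abstract (fetching semantically similar code from a database) can be illustrated with a minimal sketch. This is not the paper's algorithm: it swaps in plain bag-of-tokens cosine similarity, and the names `tokenize`, `cosine`, `retrieve`, and the contents of `DATABASE` are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical database of decompiled lines (an assumption, not data from the paper).
DATABASE = [
    "v3 = *(_DWORD *)(a1 + 8);",
    "return (unsigned int)v5;",
    "if ( !a2 ) return 0LL;",
]

def tokenize(line):
    # Split a code line into identifiers, integer literals, and single punctuation marks.
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", line)

def cosine(a, b):
    # Cosine similarity between two token-count vectors (Counter objects).
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, database, k=2):
    # Return the k database entries most similar to the query, as (score, entry) pairs.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(entry))), entry) for entry in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

if __name__ == "__main__":
    for score, entry in retrieve("v7 = *(_DWORD *)(a1 + 16);", DATABASE):
        print(f"{score:.2f}  {entry}")
```

In a retrieval-augmented setup, the retrieved entries would be placed in the prompt alongside the suspect line. A production system would more plausibly use learned embeddings than token counts; the token-count version is used here only because it is self-contained.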
