TL;DR
CoDe-R is a two-stage framework that enhances decompiler output using rationale guidance and adaptive inference, significantly improving re-executability and semantic accuracy of generated code.
Contribution
It introduces a novel semantic injection strategy and a dynamic fallback mechanism, achieving state-of-the-art results with a lightweight 1.3B model.
Findings
CoDe-R surpasses previous models in re-executability rate.
The framework effectively recovers high-level algorithmic intent.
CoDe-R achieves state-of-the-art performance on the HumanEval-Decompile benchmark.
Abstract
Binary decompilation is a critical reverse engineering task aimed at reconstructing high-level source code from stripped executables. Although Large Language Models (LLMs) have recently shown promise, they often suffer from "logical hallucinations" and "semantic misalignment" due to the irreversible semantic loss during compilation, resulting in generated code that fails to re-execute. In this study, we propose Cognitive Decompiler Refinement with Robustness (CoDe-R), a lightweight two-stage code refinement framework. The first stage introduces Semantic Cognitive Enhancement (SCE), a Rationale-Guided Semantic Injection strategy that trains the model to recover high-level algorithmic intent alongside code. The second stage introduces a Dynamic Dual-Path Fallback (DDPF) mechanism during inference, which adaptively balances semantic recovery and syntactic stability via a hybrid…
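The abstract is cut off before it states how DDPF chooses between its two paths, so the sketch below is only a generic illustration of a two-path fallback at inference time, not the authors' mechanism. Every name in it (`dual_path_fallback`, `balanced_braces`, the `primary` and `fallback` callables) is hypothetical, and the brace-balance check is a toy stand-in for whatever acceptance criterion the paper actually uses.

```python
from typing import Callable, Tuple


def balanced_braces(code: str) -> bool:
    """Toy acceptance check (assumption): curly braces must be balanced."""
    depth = 0
    for ch in code:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a closing brace appeared before its opener
                return False
    return depth == 0


def dual_path_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    is_acceptable: Callable[[str], bool],
    pseudo_code: str,
) -> Tuple[str, str]:
    """Return (path_name, output), preferring the primary path.

    `primary` stands in for a semantically richer but riskier refinement,
    `fallback` for a more conservative one. Both are hypothetical callables.
    """
    candidate = primary(pseudo_code)
    if is_acceptable(candidate):
        return "primary", candidate
    # The primary output failed the check, so use the conservative path.
    return "fallback", fallback(pseudo_code)


if __name__ == "__main__":
    raw = "int f() { return 0; }"
    broken_refiner = lambda s: "int f() {"  # simulates a malformed generation
    identity = lambda s: s                  # conservative path: keep the input
    print(dual_path_fallback(broken_refiner, identity, balanced_braces, raw))
```

In practice the acceptance check would more plausibly be a compile or re-execution test than a brace count; it is passed in as a callable so that it can be swapped without touching the selection logic.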
