Fast, Fine-Grained Equivalence Checking for Neural Decompilers
Luke Dramko, Claire Le Goues, Edward J. Schwartz

TL;DR
This paper introduces codealign, an instruction-level code equivalence technique for evaluating neural decompilers. It reports which parts of a predicted function are equivalent to the original, giving finer-grained evaluation than existing similarity measures.
Contribution
The paper presents codealign, a novel instruction-level equivalence checking technique tailored to neural decompilers, along with a formal definition of the relation it computes between equivalent instructions, termed an equivalence alignment.
Findings
codealign generates detailed equivalence alignments
It is evaluated by comparison with symbolic execution
It provides more granular insight into decompiler correctness than existing similarity measures
Abstract
Neural decompilers are machine learning models that reconstruct the source code from an executable program. Critical to the lifecycle of any machine learning model is an evaluation of its effectiveness. However, existing techniques for evaluating neural decompilation models have substantial weaknesses, especially when it comes to showing the correctness of the neural decompiler's predictions. To address this, we introduce codealign, a novel instruction-level code equivalence technique designed for neural decompilers. We provide a formal definition of a relation between equivalent instructions, which we term an equivalence alignment. We show how codealign generates equivalence alignments, then evaluate codealign by comparing it with symbolic execution. Finally, we show how the information codealign provides, namely which parts of the functions are equivalent and how well the variable names…
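To make the idea of an equivalence alignment concrete, here is a minimal toy sketch. It is not codealign's algorithm or API. It assumes straight-line code in a simplified single-assignment form and treats two instructions as equivalent when they apply the same operator to structurally identical operands, ignoring variable names. The names `Instr`, `canonical_keys`, and `toy_alignment` are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str    # name of the value this instruction defines
    op: str      # operator, e.g. "add" or "mul"
    args: tuple  # operand names (parameters or earlier dests) or int constants


def canonical_keys(params, body):
    """Map each defined name to a structural key that ignores variable names."""
    # Parameters are identified by position, not by name.
    keys = {p: ("param", i) for i, p in enumerate(params)}
    for ins in body:
        # Anything that is not a known name is treated as a constant.
        operands = tuple(keys.get(a, ("const", a)) for a in ins.args)
        keys[ins.dest] = (ins.op, operands)
    return keys


def toy_alignment(params_a, body_a, params_b, body_b):
    """Return pairs (dest_a, dest_b) of instructions with identical keys.

    Toy assumption: no control flow, no memory, no commutativity handling.
    """
    keys_a = canonical_keys(params_a, body_a)
    keys_b = canonical_keys(params_b, body_b)
    return {
        (ia.dest, ib.dest)
        for ia in body_a
        for ib in body_b
        if keys_a[ia.dest] == keys_b[ib.dest]
    }


# Hypothetical "original" function: t1 = x + y; t2 = t1 * 2
original = [Instr("t1", "add", ("x", "y")), Instr("t2", "mul", ("t1", 2))]
# Hypothetical "decompiled" prediction: v0 = a1 + a2; v1 = v0 * 3
predicted = [Instr("v0", "add", ("a1", "a2")), Instr("v1", "mul", ("v0", 3))]

alignment = toy_alignment(("x", "y"), original, ("a1", "a2"), predicted)
```

In this example the additions are aligned even though every variable name differs, while the multiplications are not, because their constants differ. A relation of this kind says which parts of a prediction are correct rather than producing a single similarity score.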
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Adversarial Robustness in Machine Learning · Neural Networks and Applications
