Fast, Fine-Grained Equivalence Checking for Neural Decompilers

Luke Dramko; Claire Le Goues; Edward J. Schwartz

arXiv:2501.04811·cs.LG·January 10, 2025

Open Access

TL;DR

This paper introduces codealign, an instruction-level code equivalence method for neural decompilers, providing detailed evaluation metrics that improve upon existing similarity measures.

Contribution

The paper presents codealign, a novel instruction-level code equivalence technique designed for neural decompilers, together with a formal definition of the relation between equivalent instructions, termed an equivalence alignment.

Findings

01. codealign generates detailed equivalence alignments

02. It outperforms symbolic execution in evaluation tasks

03. Provides more granular insights into decompiler correctness

Abstract

Neural decompilers are machine learning models that reconstruct the source code from an executable program. Critical to the lifecycle of any machine learning model is an evaluation of its effectiveness. However, existing techniques for evaluating neural decompilation models have substantial weaknesses, especially when it comes to showing the correctness of the neural decompiler's predictions. To address this, we introduce codealign, a novel instruction-level code equivalence technique designed for neural decompilers. We provide a formal definition of a relation between equivalent instructions, which we term an equivalence alignment. We show how codealign generates equivalence alignments, then evaluate codealign by comparing it with symbolic execution. Finally, we show how the information codealign provides, namely which parts of the functions are equivalent and how well the variable names…
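To make the idea of an instruction-level alignment concrete, the following is a minimal toy sketch. It is not codealign's algorithm or API: the `Instr` class, the `align` function, and the `params` mapping are hypothetical names introduced here, and the matching rule (same operator, already-aligned operands) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One instruction in a toy SSA-like form (hypothetical representation)."""
    op: str      # operator name, e.g. "mul", "add", "ret"
    args: tuple  # int = index of an earlier instruction; str = parameter or constant


def _same(x, y, alignment, params):
    """Check whether operand x (from f) corresponds to operand y (from g)."""
    if isinstance(x, int) and isinstance(y, int):
        return alignment.get(x) == y   # both are results of aligned instructions
    if isinstance(x, str) and isinstance(y, str):
        return params.get(x, x) == y   # parameters map by name; constants must match
    return False


def align(f, g, params):
    """Map instruction indices of f to indices of g (toy rule, an assumption).

    Two instructions are paired when they share an operator and every operand
    pair already corresponds. Instructions must be listed in definition order.
    """
    alignment, used = {}, set()
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            if j in used or a.op != b.op or len(a.args) != len(b.args):
                continue
            if all(_same(x, y, alignment, params) for x, y in zip(a.args, b.args)):
                alignment[i] = j
                used.add(j)
                break
    return alignment


# Original function: return a * b + c
original = [Instr("mul", ("a", "b")), Instr("add", (0, "c")), Instr("ret", (1,))]
# Hypothetical decompiler output: renamed variables plus one spurious instruction
decompiled = [
    Instr("mul", ("x", "y")),
    Instr("sub", ("z", "1")),
    Instr("add", (0, "z")),
    Instr("ret", (2,)),
]
result = align(original, decompiled, {"a": "x", "b": "y", "c": "z"})
print(result)  # {0: 0, 1: 2, 2: 3}
```

In this sketch the unaligned `sub` instruction is exactly the kind of fine-grained information a whole-function similarity score would hide. The toy rule ignores commutativity, control flow, and memory effects.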

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Adversarial Robustness in Machine Learning · Neural Networks and Applications