GraphBinMatch: Graph-based Similarity Learning for Cross-Language Binary and Source Code Matching
Ali TehraniJamsaz, Hanze Chen, Ali Jannesari

TL;DR
GraphBinMatch introduces a graph neural network approach for cross-language binary and source code matching, significantly improving accuracy over existing methods in various code matching tasks.
Contribution
It presents a novel graph neural network-based method for cross-language binary-source code matching, addressing a key challenge in multi-language software analysis.
Findings
Outperforms state-of-the-art methods by up to 15% in F1 score
Effective in cross-language binary-to-source matching
Also performs well in source-to-source matching
Abstract
Matching binary to source code and vice versa has various applications in different fields, such as computer security, software engineering, and reverse engineering. Even though there exist methods that try to match source code with binary code to accelerate the reverse engineering process, most of them are designed to focus on one programming language. However, in real life, programs are developed using different programming languages depending on their requirements. Thus, cross-language binary-to-source code matching has recently gained more attention. Nonetheless, the existing approaches still struggle to have precise predictions due to the inherent difficulties when the problem of matching binary code and source code needs to be addressed across programming languages. In this paper, we address the problem of cross-language binary source code matching. We propose GraphBinMatch, an…
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Advanced Malware Detection Techniques
