CFG2VEC: Hierarchical Graph Neural Network for Cross-Architectural Software Reverse Engineering
Shih-Yuan Yu, Yonatan Gizachew Achamyeleh, Chonghan Wang, Anton Kocheturov, Patrick Eisen, Mohammad Abdullah Al Faruque

TL;DR
CFG2VEC introduces a hierarchical GNN approach with a novel graph-of-graphs representation to improve cross-architecture binary analysis, significantly enhancing function name prediction accuracy and generalization in reverse engineering tasks.
Contribution
The paper presents cfg2vec, a hierarchical GNN with a graph-of-graphs representation for cross-architecture binary analysis, outperforming state-of-the-art methods and demonstrating practical applicability.
Findings
Outperforms state-of-the-art by 24.54% in function name prediction.
Achieves 51.84% improvement with more training data.
Generalizes well to unseen CPU architectures.
Abstract
Mission-critical embedded software is critical to our society's infrastructure but can be subject to new security vulnerabilities as technology advances. When security issues arise, Reverse Engineers (REs) use Software Reverse Engineering (SRE) tools to analyze vulnerable binaries. However, existing tools have limited support, and REs undergo a time-consuming, costly, and error-prone process that requires experience and expertise to understand the behaviors of software and vulnerabilities. To improve these tools, we propose cfg2vec, a Hierarchical Graph Neural Network (GNN) based approach. To represent a binary, we propose a novel Graph-of-Graph (GoG) representation, combining the information of control-flow and function-call graphs. Our cfg2vec learns how to represent each binary function compiled from various CPU architectures, utilizing hierarchical GNN and the…
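The Graph-of-Graph idea in the abstract can be pictured as a two-level structure: an outer function-call graph whose nodes are functions, where each function node carries its own control-flow graph. The following is a minimal sketch of such a container, not the authors' implementation; the class names, field names, and the toy binary in `build_example` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ControlFlowGraph:
    """Inner graph (hypothetical layout): basic blocks of one function."""
    name: str
    blocks: list                      # one label per basic block
    edges: list = field(default_factory=list)  # (src_index, dst_index) pairs

    def add_edge(self, src: int, dst: int) -> None:
        # Reject edges that point outside the block list.
        if not (0 <= src < len(self.blocks) and 0 <= dst < len(self.blocks)):
            raise IndexError("edge refers to a block that does not exist")
        self.edges.append((src, dst))


@dataclass
class GraphOfGraphs:
    """Outer graph (hypothetical layout): functions linked by call edges."""
    functions: dict = field(default_factory=dict)  # function name -> CFG
    calls: set = field(default_factory=set)        # (caller, callee) pairs

    def add_function(self, cfg: ControlFlowGraph) -> None:
        self.functions[cfg.name] = cfg

    def add_call(self, caller: str, callee: str) -> None:
        # Both endpoints must already be registered as functions.
        for name in (caller, callee):
            if name not in self.functions:
                raise KeyError(name)
        self.calls.add((caller, callee))

    def callees(self, caller: str) -> list:
        """Names of the functions called by `caller`, sorted."""
        return sorted(dst for src, dst in self.calls if src == caller)


def build_example() -> GraphOfGraphs:
    """Build a toy two-function binary (made-up data)."""
    main = ControlFlowGraph("main", ["entry", "loop", "exit"])
    main.add_edge(0, 1)
    main.add_edge(1, 1)   # loop back-edge
    main.add_edge(1, 2)
    parse = ControlFlowGraph("parse", ["entry", "ret"])
    parse.add_edge(0, 1)

    gog = GraphOfGraphs()
    gog.add_function(main)
    gog.add_function(parse)
    gog.add_call("main", "parse")
    return gog
```

In a hierarchical model of the kind the abstract describes, a lower-level network would presumably summarize each inner graph into a vector, and an upper-level network would then pass messages along the call edges; this sketch covers only the data layout, not the learning.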
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software System Performance and Reliability
