Multi-relational Instruction Association Graph for Cross-architecture Binary Similarity Comparison
Qige Song, Yongzheng Zhang, Shuhao Li

TL;DR
This paper introduces a novel method for cross-architecture binary similarity comparison using a multi-relational instruction association graph and R-GCN, improving accuracy without external pre-training.
Contribution
It proposes a new approach that models instruction relationships across architectures with a relational graph convolutional network, avoiding external pre-training.
Findings
Outperforms existing methods on basic block and function-level datasets.
Effective in identifying malware across diverse IoT device architectures.
Bridges semantic gaps in cross-architecture instruction representations.
Abstract
Cross-architecture binary similarity comparison is essential in many security applications. Recently, researchers have proposed learning-based approaches to improve comparison performance. These approaches adopt a paradigm of instruction pre-training, individual binary encoding, and distance-based similarity comparison. However, instruction embeddings pre-trained on an external code corpus are not universal across diverse real-world applications. Moreover, separately encoding cross-architecture binaries accumulates the semantic gap between instruction sets, limiting comparison accuracy. This paper proposes a novel cross-architecture binary similarity comparison approach with a multi-relational instruction association graph. We associate mono-architecture instruction tokens with context relevance and cross-architecture tokens with potential semantic correlations from different perspectives. Then we exploit…
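To make the idea of a multi-relational instruction graph more concrete, here is a minimal sketch in pure Python. It is not the authors' implementation. The relation names ("context", "cross"), the sliding-window co-occurrence rule, the use of aligned block pairs to create cross-architecture edges, and the helper names `build_graph` and `rgcn_layer` are all assumptions made for illustration. The propagation step follows the generic R-GCN update (a self-transform plus, for each relation, the mean of transformed neighbor features, followed by ReLU).

```python
from collections import defaultdict


def build_graph(blocks_by_arch, aligned_pairs, window=1):
    """Build {relation: {node: set(neighbors)}} over (arch, token) nodes.

    Both relations are illustrative assumptions, not the paper's definitions.
    """
    edges = defaultdict(lambda: defaultdict(set))
    # Assumed relation "context": same-architecture tokens within a window.
    for arch, blocks in blocks_by_arch.items():
        for block in blocks:
            for i, tok in enumerate(block):
                lo, hi = max(0, i - window), min(len(block), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        edges["context"][(arch, tok)].add((arch, block[j]))
    # Assumed relation "cross": tokens that appear in a pair of blocks
    # believed to be semantically equivalent across two architectures.
    for (arch_a, block_a), (arch_b, block_b) in aligned_pairs:
        for ta in block_a:
            for tb in block_b:
                edges["cross"][(arch_a, ta)].add((arch_b, tb))
                edges["cross"][(arch_b, tb)].add((arch_a, ta))
    return edges


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(h, edges, W_self, W_rel):
    """One generic R-GCN step: ReLU(W_self h_i + sum_r mean_j W_r h_j)."""
    out = {}
    for node, feat in h.items():
        acc = matvec(W_self, feat)
        for rel, adj in edges.items():
            nbrs = adj.get(node, ())
            for nb in nbrs:
                msg = matvec(W_rel[rel], h[nb])
                acc = [a + m / len(nbrs) for a, m in zip(acc, msg)]
        out[node] = [max(0.0, a) for a in acc]
    return out


# Toy usage with made-up instruction tokens.
x86_block = ["mov", "add", "ret"]
arm_block = ["ldr", "add", "bx"]
graph = build_graph(
    {"x86": [x86_block], "arm": [arm_block]},
    [(("x86", x86_block), ("arm", arm_block))],
)
identity = [[1.0, 0.0], [0.0, 1.0]]
features = {("x86", t): [1.0, 0.0] for t in x86_block}
features.update({("arm", t): [1.0, 0.0] for t in arm_block})
updated = rgcn_layer(features, graph, identity, {"context": identity, "cross": identity})
```

Keeping one weight matrix per relation is what lets such a layer treat same-architecture context edges differently from cross-architecture edges. In practice the weights would be learned jointly with the similarity objective rather than fixed as here.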
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications
