ORCAS: Obfuscation-Resilient Binary Code Similarity Analysis using Dominance Enhanced Semantic Graph
Yufeng Wang, Yuhong Feng, Yixuan Cao, Haoran Li, Haiyue Feng, Yifeng Wang

TL;DR
ORCAS is an obfuscation-resilient binary code similarity analysis method built on dominance-enhanced semantic graphs, which significantly improves robustness against code obfuscation in binary analysis tasks.
Contribution
The paper presents ORCAS, a new binary code similarity model based on dominance-enhanced semantic graphs that effectively handles obfuscated code, outperforming existing approaches.
Findings
Achieves a 12.1% PR-AUC improvement over state-of-the-art methods on obfuscated binaries.
Improves recall by up to 43% over existing methods on a real-world vulnerability dataset.
Constructs and releases a new obfuscated vulnerability dataset for research.
Abstract
Binary code similarity analysis (BCSA) serves as a foundational technique for binary analysis tasks such as vulnerability detection and malware identification. Existing graph-based BCSA approaches capture more binary code semantics and demonstrate remarkable performance. However, when code obfuscation is applied, the control flow structure becomes unstable and their performance degrades. To address this issue, we develop ORCAS, an Obfuscation-Resilient BCSA model based on the Dominance Enhanced Semantic Graph (DESG). The DESG is an original binary code representation that captures more of a binary's implicit semantics without relying on control flow structure, including inter-instruction relations (e.g., def-use), inter-basic block relations (i.e., dominance and post-dominance), and instruction-basic block relations. ORCAS takes binary functions from different obfuscation options, optimization levels, and instruction…
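To make the inter-basic block relations named in the abstract concrete, here is a minimal sketch of how dominance and post-dominance can be computed over a control flow graph using the classic iterative data-flow formulation. It is not the ORCAS implementation: the function names `dominators` and `post_dominators`, the adjacency-dict input format, and the four-block example graph are all assumptions made for illustration.

```python
def dominators(cfg, entry):
    """Map each node to the set of nodes that dominate it.

    cfg is a hypothetical adjacency dict {block: [successor blocks]}.
    A node d dominates n if every path from entry to n passes through d.
    """
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | set.intersection(*pred_doms) if pred_doms else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def post_dominators(cfg, exit_node):
    """Post-dominance is dominance on the reversed graph, rooted at the exit."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    reversed_cfg = {n: [] for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            reversed_cfg[s].append(n)
    return dominators(reversed_cfg, exit_node)


# Illustrative diamond-shaped function: A branches to B or C, both rejoin at D.
example_cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
dom = dominators(example_cfg, "A")
pdom = post_dominators(example_cfg, "D")
```

In this example, block D is dominated only by A and itself, because either B or C may be skipped on a given path. Relations of this kind depend on which blocks must execute before or after others rather than on the exact edges between them, which is one plausible reading of why the abstract treats them as more stable under obfuscation than the raw control flow structure.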
