Interpretation-enabled Software Reuse Detection Based on a Multi-Level Birthmark Model
Xi Xu, Qinghua Zheng, Zheng Yan, Ming Fan, Ang Jia, Ting Liu

TL;DR
This paper introduces ISRD, an interpretation-enabled method for detecting software reuse in binaries using a multi-level birthmark model, achieving high accuracy and robustness against obfuscation and cross-compilation.
Contribution
The paper presents a novel multi-level birthmark model and an intent search technique for efficient, accurate, and interpretable software reuse detection in binary code.
Findings
Achieves 97.2% precision and 94.8% recall in experiments.
Outperforms existing approaches in accuracy and robustness.
Resilient to obfuscation introduced by cross-compilation.
Abstract
Software reuse, especially partial reuse, poses legal and security threats to software development. Because source code is usually unavailable, software reuse is hard to detect in an interpretable way. Moreover, current approaches suffer from poor detection accuracy and efficiency and fall far short of practical demands. To tackle these problems, we propose ISRD, an interpretation-enabled software reuse detection approach based on a multi-level birthmark model that spans the function level, the basic block level, and the instruction level. To overcome obfuscation caused by cross-compilation, we represent function semantics with the Minimum Branch Path (MBP) and apply normalization to extract the core semantics of instructions. To detect reused functions efficiently, a process for "intent search based on anchor recognition" is designed to speed up reuse…
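To make the idea of instruction normalization concrete, here is a minimal sketch in Python. It is not ISRD's actual procedure, whose rules are not given in this summary. It only illustrates the general technique of replacing compiler-dependent operands (registers, immediates, memory references) with coarse placeholders so that differently compiled instructions compare equal. The operand classes `REG`, `IMM`, `MEM`, and `SYM`, the x86 register pattern, and both function names are assumptions made for illustration.

```python
import re

# Hypothetical operand classes for x86-style assembly text (Intel syntax).
# The register list is deliberately partial; it is an assumption, not ISRD's rule set.
REGISTER = re.compile(
    r"^(?:[er]?(?:ax|bx|cx|dx|si|di|sp|bp)|r(?:[89]|1[0-5])[dwb]?|[abcd][lh])$"
)
IMMEDIATE = re.compile(r"^-?(?:0x[0-9a-f]+|\d+)$")


def normalize_operand(op: str) -> str:
    """Map one operand to a coarse class label."""
    op = op.strip().lower()
    if "[" in op:            # any bracketed expression is treated as a memory reference
        return "MEM"
    if REGISTER.match(op):
        return "REG"
    if IMMEDIATE.match(op):
        return "IMM"
    return "SYM"             # anything else, e.g. a call target or label


def normalize_instruction(line: str) -> str:
    """Keep the mnemonic and replace every operand with its class label."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [normalize_operand(o) for o in rest.split(",") if o.strip()]
    return " ".join([mnemonic.lower()] + operands)


if __name__ == "__main__":
    # Two instructions that differ only in register choice and constant
    # collapse to the same normalized form.
    print(normalize_instruction("mov eax, 0x10"))
    print(normalize_instruction("MOV rbx, 16"))
```

Under these assumptions, register allocation and constant choices that vary across compilers and optimization levels no longer affect the comparison, while the mnemonic and operand kinds are preserved. A real system would need a complete register table and a principled treatment of memory operands.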
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Digital and Cyber Forensics
