Interpretation-enabled Software Reuse Detection Based on a Multi-Level Birthmark Model

Xi Xu; Qinghua Zheng; Zheng Yan; Ming Fan; Ang Jia; Ting Liu

arXiv:2103.10126·cs.SE·March 19, 2021

Open Access

TL;DR

This paper introduces ISRD, an interpretation-enabled method for detecting software reuse in binaries using a multi-level birthmark model, achieving high accuracy and robustness against obfuscation and cross-compilation.

Contribution

The paper presents a novel multi-level birthmark model and an intent search technique for efficient, accurate, and interpretable software reuse detection in binary code.

Findings

01

Achieves 97.2% precision and 94.8% recall in experiments.

02

Outperforms existing approaches in accuracy and robustness.

03

Resilient to obfuscation caused by cross-compilation.

Abstract

Software reuse, especially partial reuse, poses legal and security threats to software development. Because source code is usually unavailable, software reuse is hard to detect in an interpretable way. Moreover, current approaches suffer from poor detection accuracy and efficiency and fall far short of practical demands. To tackle these problems, we propose ISRD, an interpretation-enabled software reuse detection approach based on a multi-level birthmark model that covers the function level, basic block level, and instruction level. To overcome obfuscation caused by cross-compilation, we represent function semantics with Minimum Branch Path (MBP) and perform normalization to extract the core semantics of instructions. To detect reused functions efficiently, a process for "intent search based on anchor recognition" is designed to speed up reuse…
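
The abstract mentions normalizing instructions to extract their core semantics but does not give the rules. The following is a minimal Python sketch of one common way to do this, assuming Intel-syntax x86 text and a hypothetical four-class operand scheme (`REG`, `MEM`, `IMM`, `SYM`). The register set, class names, and function names are illustrative assumptions, not the normalization ISRD actually uses.

```python
import re

# Hypothetical register set for illustration; a real tool would cover the full ISA.
REGISTERS = {
    "eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp",
    "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
}

def normalize_operand(op: str) -> str:
    """Map a concrete operand to a coarse class (assumed scheme, not the paper's)."""
    op = op.strip().lower()
    if "[" in op:
        return "MEM"   # any memory reference, e.g. [ebp-8]
    if op in REGISTERS:
        return "REG"   # register names differ across compilers and options
    if re.fullmatch(r"-?(0x[0-9a-f]+|\d+)", op):
        return "IMM"   # immediate constants and offsets
    return "SYM"       # anything else, e.g. a call target or label

def normalize_instruction(ins: str) -> str:
    """Keep the mnemonic, replace each operand with its class."""
    parts = ins.strip().split(None, 1)
    mnemonic = parts[0].lower()
    if len(parts) == 1:
        return mnemonic
    operands = [normalize_operand(o) for o in parts[1].split(",")]
    return mnemonic + " " + ",".join(operands)

def normalize_block(block):
    """Normalize every instruction in a basic block (a list of strings)."""
    return [normalize_instruction(i) for i in block]
```

Under this scheme, two basic blocks that differ only in register allocation, such as `["mov eax, 5", "add eax, ebx"]` and `["mov ecx, 5", "add ecx, edx"]`, normalize to the same sequence. That is the kind of compiler-induced variation normalization is meant to absorb.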

Taxonomy

Topics: Software Engineering Research · Advanced Malware Detection Techniques · Digital and Cyber Forensics