
TL;DR
This paper introduces a novel binary code similarity detection method that uses symbolic execution and key instruction analysis to improve robustness against code mutations, with a prototype demonstrating promising results.
Contribution
It proposes a new approach focusing on key instructions and symbolic execution to enhance binary similarity detection, addressing limitations of existing methods.
Findings
Effective symbolic execution of binary code
Key instruction graph similarity correlates with code similarity
Prototype shows improved detection accuracy
Abstract
Binary code similarity detection is to detect the similarity of code at binary (assembly) level without source code. Existing works have their limitations when dealing with mutated binary code generated by different compiling options. In this paper, we propose a novel approach to addressing this problem. By inspecting the binary code, we found that generally, within a function, some instructions aim to calculate (prepare) values for other instructions. The latter instructions are defined by us as key instructions. Currently, we define four categories of key instructions: calling subfunctions, comparing instruction, returning instruction, and memory-store instruction. Thus if we symbolically execute similar binary codes, symbolic values at these key instructions are expected to be similar. As such, we implement a prototype tool, which has three steps. First, it symbolically executes…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSoftware Engineering Research · Advanced Malware Detection Techniques · Software Reliability and Analysis Research
