Fool Me If You Can: On the Robustness of Binary Code Similarity Detection Models against Semantics-preserving Transformations
Jiyong Uhm, Minseok Kim, Michalis Polychronakis, Hyungjoon Koo

TL;DR
This paper evaluates the robustness of deep learning models for binary code similarity detection against semantics-preserving transformations, revealing vulnerabilities and factors influencing model resilience.
Contribution
It introduces asmFooler, a system for assessing BCSD model robustness, and constructs a large dataset with diverse adversarial transformations to analyze model vulnerabilities.
Findings
Model robustness depends on processing pipeline components.
Transformation effectiveness is limited by model-specific constraints.
Well-designed transformations can significantly mislead models with minimal changes.
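To make the idea of a semantics-preserving transformation concrete, here is a toy sketch. It is not asmFooler and not any model evaluated in the paper: the `insert_nops` transformation, the bigram-based `jaccard` score, and the sample instruction list are all illustrative assumptions. It shows how a transformation that leaves program behavior unchanged can still alter the features a naive similarity measure relies on.

```python
from typing import List, Set, Tuple


def insert_nops(instrs: List[str]) -> List[str]:
    """Insert a `nop` after every instruction.

    `nop` changes no registers, flags, or memory, so the function's
    behavior is unchanged (a hypothetical, deliberately simple transformation).
    """
    out: List[str] = []
    for ins in instrs:
        out.append(ins)
        out.append("nop")
    return out


def bigrams(instrs: List[str]) -> Set[Tuple[str, str]]:
    """Return the set of adjacent instruction pairs."""
    return set(zip(instrs, instrs[1:]))


def jaccard(a: List[str], b: List[str]) -> float:
    """Jaccard similarity over instruction bigrams.

    This is a toy stand-in for a similarity model, not a learned BCSD model.
    """
    ga, gb = bigrams(a), bigrams(b)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


# Hypothetical x86-64 function body that returns the sum of two arguments.
original = [
    "push rbp",
    "mov rbp, rsp",
    "mov eax, edi",
    "add eax, esi",
    "pop rbp",
    "ret",
]
transformed = insert_nops(original)

print("self similarity:       ", jaccard(original, original))
print("after nop insertion:   ", jaccard(original, transformed))
```

Because every adjacent pair in the transformed sequence now involves a `nop`, the bigram sets no longer overlap even though the two sequences compute the same result. Learned models are far less brittle than this toy measure, which is why the paper's question of how much robustness each pipeline component contributes is non-trivial.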
Abstract
Binary code analysis plays an essential role in cybersecurity, facilitating reverse engineering to reveal the inner workings of programs in the absence of source code. Traditional approaches, such as static and dynamic analysis, extract valuable insights from stripped binaries, but often demand substantial expertise and manual effort. Recent advances in deep learning have opened promising opportunities to enhance binary analysis by capturing latent features and disclosing underlying code semantics. Despite the growing number of binary analysis models based on machine learning, their robustness to adversarial code transformations at the binary level remains underexplored. We evaluate the robustness of deep learning models for the task of binary code similarity detection (BCSD) under semantics-preserving transformations. The unique nature of machine instructions presents distinct…
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Software Testing and Debugging Techniques
