1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis
Ang Jia, Ming Fan, Wuxia Jin, Xi Xu, Zhaohui Zhou, Qiyi Tang, Sen Nie, Shi Wu, Ting Liu

TL;DR
This paper investigates how function inlining affects binary similarity analysis, revealing significant performance drops in existing methods that neglect inlining, and proposing strategies to better simulate inlining effects.
Contribution
It highlights the impact of function inlining on binary similarity analysis and proposes new strategies for more accurate inlining simulation.
Findings
The proportion of function inlining in binaries can reach nearly 70%.
Existing methods suffer up to 40% performance loss due to inlining.
Current inlining-simulation strategies recover only 60% of inlined functions.
Abstract
Binary similarity analysis is critical to many code-reuse-related issues, and the "1-to-1" mechanism is widely applied, where one function in a binary file is matched against one function in a source file or binary file. However, we discover that function mapping is a more complex problem of "1-to-n" or even "n-to-n" due to the existence of function inlining. In this paper, we investigate the effect of function inlining on binary similarity analysis. We first construct four inlining-oriented datasets for four similarity analysis tasks: code search, OSS reuse detection, vulnerability detection, and patch presence test. Then, we further study the extent of function inlining, the performance of existing works under function inlining, and the effectiveness of existing inlining-simulation strategies. Results show that the proportion of function inlining can reach nearly 70%, while most…
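To make the "1-to-1" versus "1-to-n" distinction concrete, the sketch below labels each binary function by how many source functions it contains and computes the share of binary functions that absorbed inlined code. It is an illustration only: the mapping, the function names, and the way the proportion is counted are all assumptions made for this example, and the paper's own datasets and metric definitions may differ.

```python
from typing import Dict, Set


def classify_mappings(binary_to_source: Dict[str, Set[str]]) -> Dict[str, str]:
    """Label each binary function as '1-to-1' or '1-to-n'.

    binary_to_source is a hypothetical mapping from a binary function name
    to the set of source functions whose code ended up inside it.
    """
    return {
        name: "1-to-n" if len(sources) > 1 else "1-to-1"
        for name, sources in binary_to_source.items()
    }


def inlining_proportion(binary_to_source: Dict[str, Set[str]]) -> float:
    """Fraction of binary functions that contain more than one source function."""
    if not binary_to_source:
        return 0.0
    inlined = sum(1 for sources in binary_to_source.values() if len(sources) > 1)
    return inlined / len(binary_to_source)


# Toy data (invented for illustration, not taken from the paper).
example = {
    "parse_header": {"parse_header", "read_u32", "check_magic"},
    "main": {"main"},
    "compress": {"compress", "flush_block"},
    "usage": {"usage"},
}

labels = classify_mappings(example)
ratio = inlining_proportion(example)
print(labels)
print(f"{ratio:.0%} of binary functions contain inlined code")
```

In this toy mapping, a 1-to-1 matcher that compares `parse_header` in the binary only against `parse_header` in the source would miss the code contributed by the two inlined callees, which is the mismatch the abstract describes.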
Taxonomy
Topics: Advanced Data Storage Technologies · Software Engineering Research · Network Security and Intrusion Detection
