TL;DR
Epitome is a framework that improves binary function name prediction for binaries compiled at different optimization levels. It combines votes-based name tokenization, multi-task learning, and pre-trained models, and the authors report that it significantly outperforms existing tools.
Contribution
The paper introduces Epitome, a new approach combining votes-based name tokenization and multi-task learning with pre-trained models to enhance function name prediction accuracy and generalizability.
Findings
Epitome achieves up to 44.34% improvement in precision.
Epitome outperforms state-of-the-art tools in recall and F1 score.
Epitome demonstrates strong generalizability across architectures and optimization levels.
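The findings above are stated in terms of precision, recall, and F1. The summary does not say how these are computed, so the following is only a minimal sketch under one assumption: that scores are taken over the word tokens of a function name, comparing predicted tokens against ground-truth tokens. The function name `token_prf` and the sample tokens are hypothetical and do not come from the paper.

```python
from collections import Counter


def token_prf(predicted, reference):
    """Token-level precision, recall, and F1 for one predicted function name.

    Assumption: names are already split into word tokens, and a predicted
    token counts as correct if it also occurs in the reference name
    (multiset overlap, so repeated tokens are matched at most once each).
    """
    pred_counts = Counter(predicted)
    ref_counts = Counter(reference)
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((pred_counts & ref_counts).values())

    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the prediction misses one ground-truth token.
p, r, f = token_prf(["read", "file"], ["read", "config", "file"])
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Under this assumed scheme, a prediction that contains only correct tokens but omits some ground-truth tokens keeps full precision while losing recall.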
Abstract
Descriptive function names give reverse engineers valuable insight, but they are absent from publicly released binaries. Recent advances in binary function name prediction using data-driven machine learning show promise. However, existing approaches struggle to capture function semantics across binaries compiled with different optimizations and fail to preserve the meaning of labels in function names. We propose Epitome, a framework that enhances function name prediction using votes-based name tokenization and multi-task learning, specifically tailored to binaries compiled at different optimization levels. Epitome learns comprehensive function semantics with a pre-trained assembly language model and a graph neural network, and incorporates a function semantics similarity prediction task to maximize the similarity of function semantics across different compilation optimization levels. In…
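The multi-task idea in the abstract, pairing name prediction with a function semantics similarity task, can be illustrated with a small sketch. This is not the authors' implementation: the loss forms (cross-entropy for the name token, one minus cosine similarity for the embedding pair), the weighting parameter `weight`, and all function names are assumptions chosen only to show how two objectives might be combined into one training signal.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the correct name token (assumed loss form)."""
    return -math.log(probs[target_index])


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_task_loss(name_probs, target_index, emb_a, emb_b, weight=0.5):
    """Combine a name prediction loss with a similarity loss.

    emb_a and emb_b stand for embeddings of the same source function
    compiled at two different optimization levels (hypothetical inputs).
    The similarity term shrinks as the two embeddings align.
    """
    name_loss = cross_entropy(name_probs, target_index)
    similarity_loss = 1.0 - cosine_similarity(emb_a, emb_b)
    return name_loss + weight * similarity_loss


# Hypothetical values: identical embeddings add no similarity penalty.
aligned = multi_task_loss([0.5, 0.5], 0, [1.0, 0.0], [1.0, 0.0])
# Orthogonal embeddings add the full weighted penalty.
misaligned = multi_task_loss([0.5, 0.5], 0, [1.0, 0.0], [0.0, 1.0])
print(aligned, misaligned)
```

In this sketch the second term penalizes a model whose representations of the same function drift apart between optimization levels, which is one plausible reading of "maximize the similarity of function semantics".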
