Multilevel Semantic Embedding of Software Patches: A Fine-to-Coarse Grained Approach Towards Security Patch Detection
Xunzhu Tang, Zhenghan Chen, Saad Ezzini, Haoye Tian, Yewei Song, Jacques Klein, Tegawende F. Bissyande

TL;DR
This paper introduces MultiSEM, a multilevel semantic embedding model that combines word-level, line-level, and description semantics to improve security patch detection in software, outperforming existing models.
Contribution
The paper presents a novel multilevel semantic embedding approach, MultiSEM, that integrates word, line, and description semantics for enhanced security patch detection.
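To make the idea of combining word-level, line-level, and description-level views concrete, here is a minimal sketch. It is not the MultiSEM architecture: the hashing-based `toy_embed` stands in for a learned encoder, and `DIM`, `mean_pool`, and `multilevel_features` are hypothetical names chosen for illustration. Fusing the three levels by plain concatenation is likewise an assumption, not something the summary confirms.

```python
import hashlib
import re

DIM = 8  # toy embedding width, chosen arbitrarily for this sketch


def toy_embed(text, dim=DIM):
    """Hypothetical stand-in for a learned encoder: hash tokens into buckets."""
    vec = [0.0] * dim
    for tok in re.findall(r"\w+", text.lower()):
        bucket = int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    # L2-normalise so long and short inputs are comparable; all-zero stays zero.
    return [v / norm for v in vec] if norm else vec


def mean_pool(vectors, dim=DIM):
    """Average a list of equal-length vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def multilevel_features(diff_lines, description):
    """Concatenate word-level, line-level, and description-level vectors."""
    # Keep added/removed lines only, skipping the '+++' / '---' file headers.
    changed = [
        line[1:]
        for line in diff_lines
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
    words = [w for line in changed for w in re.findall(r"\w+", line)]
    word_level = mean_pool([toy_embed(w) for w in words])     # fine-grained view
    line_level = mean_pool([toy_embed(l) for l in changed])   # coarse-grained view
    desc_level = toy_embed(description)                       # patch description
    return word_level + line_level + desc_level


patch = [
    "--- a/buf.c",
    "+++ b/buf.c",
    "-    strcpy(dst, src);",
    "+    strncpy(dst, src, sizeof(dst) - 1);",
]
features = multilevel_features(patch, "Fix buffer overflow in copy routine")
print(len(features))  # 24
```

In a real detector the concatenated vector would feed a classifier that labels the patch as security-related or not; that step is omitted here.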
Findings
Achieved 22.46% improvement on PatchDB in F1 score.
Achieved 9.21% improvement on SPI-DB in F1 score.
Demonstrated robustness and superiority over state-of-the-art models.
Abstract
The growth of open-source software has increased the risk of hidden vulnerabilities that can affect downstream software applications. This concern is further exacerbated by software vendors' practice of silently releasing security patches without explicit warnings or Common Vulnerabilities and Exposures (CVE) notifications. This lack of transparency leaves users unaware of potential security threats, giving attackers an opportunity to take advantage of these vulnerabilities. In the complex landscape of software patches, grasping the nuanced semantics of a patch is vital for ensuring secure software maintenance. To address this challenge, we introduce a multilevel Semantic Embedder for security patch detection, termed MultiSEM. This model harnesses word-centric vectors at a fine-grained level, emphasizing the significance of individual words, while the coarse-grained layer adopts entire…
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Software System Performance and Reliability
