Enhancing Security Patch Identification by Capturing Structures in Commits
Bozhi Wu, Shangqing Liu, Ruitao Feng, Xiaofei Xie, Jingkai Siow, and Shang-Wei Lin

TL;DR
This paper introduces E-SPI, a novel approach for security patch identification that captures structural information in commits using code and message encoders, significantly outperforming existing methods.
Contribution
E-SPI effectively extracts structural information from commits with code and message encoders, improving security patch identification accuracy over prior flat-sequence models.
Findings
E-SPI outperforms six state-of-the-art approaches in experiments.
Structural information improves patch identification accuracy.
Approach validated on real deployment data.
Abstract
With the rapidly increasing number of open source software (OSS), the majority of the software vulnerabilities in open source components are fixed silently, which leaves the deployed software that integrates them unable to get a timely update. Hence, it is critical to design a security patch identification system to ensure the security of the utilized software. However, most existing works on security patch identification treat the changed code and the commit message of a commit as a flat sequence of tokens and use simple neural networks to learn its semantics, while the structure information is ignored. To address these limitations, in this paper, we propose our well-designed approach E-SPI, which extracts the structure information hidden in a commit for effective identification. Specifically, it consists of the code change encoder to extract the syntactic of the…
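The abstract's contrast between a flat token sequence and structure information can be illustrated with a small sketch. This is not E-SPI's encoder: it uses Python's standard `tokenize` and `ast` modules as a stand-in, and the helper names (`flat_tokens`, `ast_edges`) and the toy "patch" are assumptions made purely for illustration.

```python
import ast
import io
import tokenize


def flat_tokens(source: str) -> list[str]:
    """Flat view: the code as a plain sequence of token strings."""
    skip = {
        tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
        tokenize.DEDENT, tokenize.ENDMARKER,
    }
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in skip]


def ast_edges(source: str) -> list[tuple[str, str]]:
    """Structural view: (parent type, child type) edges of the syntax tree."""
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges


# Hypothetical patch: a bounds check is added around an existing call.
before = "copy(buf, n)\n"
after = "if n <= size:\n    copy(buf, n)\n"

# Flat view: the patch only shows up as a few extra tokens in a sequence.
print(flat_tokens(before))
print(flat_tokens(after))

# Structural view: the patch introduces a guard (an If node wrapping the call).
new_edges = set(ast_edges(after)) - set(ast_edges(before))
print(sorted(new_edges))
```

In the flat view the added guard is just a handful of additional tokens, whereas the tree view makes explicit that the existing call is now nested under a conditional. That nesting is the kind of relationship a structure-aware encoder can exploit and a flat-sequence model has to infer.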
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Software System Performance and Reliability
