LLM-Enhanced Software Patch Localization
Jinhong Yu, Yi Chen, Di Tang, Xiaozhong Liu, XiaoFeng Wang, Chen Wu, Haixu Tang

TL;DR
This paper introduces LLM-SPL, a novel approach that uses large language models to improve security patch localization in open source software, especially for complex cases with multiple patches, reducing manual effort and increasing ranking accuracy.
Contribution
The paper proposes a joint learning framework integrating LLM outputs as features to enhance security patch recommendation, addressing limitations of previous models in CVE association and multiple patch scenarios.
Findings
LLM-SPL outperforms state-of-the-art approaches in patch ranking accuracy.
Significantly improves recall and NDCG for multi-patch vulnerabilities.
Reduces manual effort by over 25% in patch identification.
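The findings above are stated in terms of recall and NDCG over a ranked list of candidate commits. As a minimal sketch of how these two metrics are commonly computed for a single CVE with several patch commits, the example below assumes binary relevance (a commit either is a patch or is not). The function names, commit IDs, and cutoff values are hypothetical and are not taken from the paper, whose exact evaluation protocol is not given in this summary.

```python
import math


def recall_at_k(ranked_commits, patch_commits, k):
    """Fraction of the CVE's true patch commits found in the top k of the ranking."""
    patches = set(patch_commits)
    if not patches:
        return 0.0
    return len(set(ranked_commits[:k]) & patches) / len(patches)


def ndcg_at_k(ranked_commits, patch_commits, k):
    """NDCG@k with binary relevance: each patch commit has gain 1, all others 0."""
    patches = set(patch_commits)
    # Discounted gain of the patches actually retrieved in the top k.
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, commit in enumerate(ranked_commits[:k])
        if commit in patches
    )
    # Ideal ranking places every patch commit at the top of the list.
    ideal_hits = min(k, len(patches))
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking of candidate commits for one CVE with two patch commits.
ranking = ["c7", "c2", "c9", "c4", "c1"]
patches = ["c2", "c4"]

print(recall_at_k(ranking, patches, 3))  # one of the two patches is in the top 3
print(ndcg_at_k(ranking, patches, 5))
```

With multiple patches per CVE, recall at a small cutoff rewards rankings that surface every patch early, while NDCG additionally penalizes patches that appear lower in the list.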
Abstract
Open source software (OSS) is integral to modern product development, and any vulnerability within it potentially compromises numerous products. While developers strive to apply security patches, pinpointing these patches among extensive OSS updates remains a challenge. Security patch localization (SPL) recommendation methods are leading approaches to address this. However, existing SPL models often falter when a commit lacks a clear association with its corresponding CVE, and do not consider the scenario in which a vulnerability has multiple patches proposed over time before it is fully resolved. To address these challenges, we introduce LLM-SPL, a recommendation-based SPL approach that leverages the capabilities of a Large Language Model (LLM) to locate the security patch commit for a given CVE. More specifically, we propose a joint learning framework, in which the outputs of LLM…
Taxonomy
Topics: Software System Performance and Reliability · Software Engineering Research
