TL;DR
ProphetFuzz leverages large language models with prompt engineering to automatically predict high-risk option combinations and efficiently fuzz test software, significantly improving vulnerability detection over existing methods.
Contribution
The paper introduces ProphetFuzz, a novel tool that uses LLMs and prompt engineering to predict and fuzz high-risk option combinations without human input, enhancing testing efficiency.
Findings
Successfully predicted 1748 high-risk combinations at low cost
Discovered 364 vulnerabilities after 72 hours of fuzzing, outperforming the state of the art
Uncovered 140 vulnerabilities in latest program versions, with many confirmed and assigned CVEs
Abstract
Vulnerabilities related to option combinations pose a significant challenge in software security testing due to their vast search space. Previous research primarily addressed this challenge through mutation or filtering techniques, which inefficiently treated all option combinations as having equal potential for vulnerabilities, thus wasting considerable time on non-vulnerable targets and resulting in low testing efficiency. In this paper, we utilize carefully designed prompt engineering to drive the large language model (LLM) to predict high-risk option combinations (i.e., more likely to contain vulnerabilities) and perform fuzz testing automatically without human intervention. We developed a tool called ProphetFuzz and evaluated it on a dataset comprising 52 programs collected from three related studies. The entire experiment consumed 10.44 CPU years. ProphetFuzz successfully…
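To make the "vast search space" in the abstract concrete, the following is a minimal sketch of how quickly the number of option combinations grows. It is not ProphetFuzz's code: the function names `count_combinations` and `enumerate_combinations` and the option flags are hypothetical, and the sketch assumes a combination is simply an unordered set of two or more options.

```python
from itertools import combinations
from math import comb


def count_combinations(num_options: int, max_size: int) -> int:
    """Count unordered option sets containing 2..max_size options."""
    return sum(comb(num_options, k) for k in range(2, max_size + 1))


def enumerate_combinations(options, max_size):
    """Yield every unordered option set containing 2..max_size options."""
    for k in range(2, max_size + 1):
        yield from combinations(options, k)


if __name__ == "__main__":
    # Hypothetical flags, for illustration only.
    options = ["-a", "-b", "-c", "-d"]
    for combo in enumerate_combinations(options, 3):
        print(" ".join(combo))

    # The count grows exponentially with the number of options.
    for n in (10, 20, 30):
        print(n, count_combinations(n, n))
```

Under these assumptions, a program with 30 options already has over a billion candidate combinations, which is why treating all of them as equally likely to be vulnerable is costly and why prioritizing a small high-risk subset matters.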
