LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing
Hongxiang Zhang, Yuyang Rong, Yifeng He, Hao Chen

TL;DR
LLAMAFUZZ leverages large language models to enhance greybox fuzzing, especially for structured data, leading to significant improvements in bug detection and code coverage over traditional methods.
Contribution
This paper introduces LLAMAFUZZ, a novel LLM-based greybox fuzzing approach that effectively handles structured data and improves bug discovery and coverage.
Findings
Outperforms top competitors by 41 bugs on average.
Identifies 47 unique bugs across experiments.
Covers 27.19% more branches than AFL++.
Abstract
Greybox fuzzing has been successful at revealing bugs and vulnerabilities in programs. However, randomized mutation strategies limit a fuzzer's performance on structured data. Specialized fuzzers can handle complex structured data, but they require additional effort to write grammars and suffer from low throughput. In this paper, we explore the potential of using a large language model (LLM) to enhance greybox fuzzing for structured data. We use the LLM's pre-trained knowledge of data conversion and formats to generate new valid inputs. We further fine-tune it with paired mutation seeds so that it learns structured formats and mutation strategies effectively. Our LLM-based fuzzer, LLAMAFUZZ, integrates the LLM's ability to understand and mutate structured data into fuzzing. We conduct experiments on the standard bug-based benchmark Magma and a wide variety of real-world programs. LLAMAFUZZ…
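To make the idea in the abstract concrete, here is a minimal sketch of a coverage-guided (greybox) loop with a pluggable mutator. It is not the authors' implementation. The toy input format (`MAGIC` plus a length byte plus a payload), the `toy_target` function, and both mutators are hypothetical. In particular, `structured_mutate` is a hand-written stand-in for the role the paper assigns to the fine-tuned LLM: producing new inputs that stay valid for the format.

```python
import random

MAGIC = b"HDR"  # hypothetical file signature for the toy format


def toy_target(data: bytes) -> frozenset:
    """Hypothetical target: returns the set of branch IDs the input exercises."""
    covered = {"entry"}
    if data[:3] != MAGIC:
        return frozenset(covered)
    covered.add("magic_ok")
    if len(data) < 4:
        return frozenset(covered)
    length, payload = data[3], data[4:]
    if length == len(payload):
        covered.add("length_ok")
        if payload.startswith(b"\xff"):
            covered.add("deep_branch")
    return frozenset(covered)


def random_mutate(seed: bytes, rng: random.Random) -> bytes:
    """Format-agnostic mutation: overwrite one randomly chosen byte."""
    if not seed:
        return bytes([rng.randrange(256)])
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def structured_mutate(seed: bytes, rng: random.Random) -> bytes:
    """Stand-in for a format-aware mutator (an assumption, not the paper's LLM).

    Keeps the magic, generates a fresh payload, and fixes up the length field
    so the result always parses.
    """
    payload = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 4)))
    if rng.random() < 0.5:
        payload = b"\xff" + payload[1:]
    return MAGIC + bytes([len(payload)]) + payload


def fuzz(seeds, mutators, iterations, rng):
    """Greybox loop: keep a mutated input only if it reaches new branches."""
    queue = list(seeds)
    seen = set()
    for s in queue:
        seen |= toy_target(s)
    for _ in range(iterations):
        parent = rng.choice(queue)
        child = rng.choice(mutators)(parent, rng)
        cov = toy_target(child)
        if not cov <= seen:  # new coverage found
            seen |= cov
            queue.append(child)
    return seen, queue


if __name__ == "__main__":
    rng = random.Random(0)
    seen, queue = fuzz([b"HDR\x01A"], [random_mutate, structured_mutate], 500, rng)
    print(sorted(seen), len(queue))
```

A single-byte random mutation can easily break the magic or the length field, so most of its children are rejected early by the parser. The format-aware mutator keeps every child parseable, which is the motivation the abstract gives for bringing format knowledge into the mutation step.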
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
