LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing

Hongxiang Zhang; Yuyang Rong; Yifeng He; Hao Chen

arXiv:2406.07714 · cs.CR · March 18, 2026 · 1 cites

TL;DR

LLAMAFUZZ leverages large language models to enhance greybox fuzzing, especially for structured data, leading to significant improvements in bug detection and code coverage over traditional methods.

Contribution

This paper introduces LLAMAFUZZ, a novel LLM-based greybox fuzzing approach that effectively handles structured data and improves bug discovery and coverage.

Findings

01 Outperforms top competitors by 41 bugs on average.

02 Identified 47 unique bugs across experiments.

03 Covers 27.19% more branches than AFL++.

Abstract

Greybox fuzzing has achieved success in revealing bugs and vulnerabilities in programs. However, randomized mutation strategies limit the fuzzer's performance on structured data. Specialized fuzzers can handle complex structured data, but they require additional grammar-writing effort and suffer from low throughput. In this paper, we explore the potential of using a large language model (LLM) to enhance greybox fuzzing for structured data. We use the LLM's pre-trained knowledge of data conversion and formats to generate new valid inputs. We further fine-tune it with paired mutation seeds so that it learns structured formats and mutation strategies effectively. Our LLM-based fuzzer, LLAMAFUZZ, integrates the LLM's ability to understand and mutate structured data into fuzzing. We conduct experiments on the standard bug-based benchmark Magma and a wide variety of real-world programs. LLAMAFUZZ…
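To make the idea in the abstract concrete, here is a minimal sketch of a coverage-guided (greybox) loop with pluggable mutators. It is not the LLAMAFUZZ implementation. The toy target, the `HDR` format, and every function name are hypothetical, and `structured_mutator` is only a stand-in for a fine-tuned LLM that has learned the input format. The loop also records (parent, child) pairs whenever a mutation reaches new branches, which is one plausible reading of "paired mutation seeds".

```python
import random
from typing import Callable, List, Set, Tuple

Mutator = Callable[[bytes, random.Random], bytes]


def toy_target(data: bytes) -> Set[str]:
    """Hypothetical program under test: returns the branch IDs it hit."""
    branches = {"entry"}
    if data.startswith(b"HDR"):
        branches.add("magic_ok")
        # Byte 3 is a length field that must match the payload size.
        if len(data) > 4 and data[3] == len(data) - 4:
            branches.add("length_ok")
    return branches


def random_byte_mutator(seed: bytes, rng: random.Random) -> bytes:
    """Classic randomized mutation: overwrite one random byte."""
    if not seed:
        return bytes([rng.randrange(256)])
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def structured_mutator(seed: bytes, rng: random.Random) -> bytes:
    """Stand-in for an LLM mutator (assumption): emits a format-valid input
    with a correct magic header and length field, and a fresh payload."""
    payload = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
    return b"HDR" + bytes([len(payload)]) + payload


def fuzz(
    seeds: List[bytes],
    mutators: List[Mutator],
    iterations: int,
    rng: random.Random,
) -> Tuple[Set[str], List[Tuple[bytes, bytes]]]:
    """Keep any mutated input that reaches a branch not seen before."""
    corpus = list(seeds)
    covered: Set[str] = set()
    for seed in corpus:
        covered |= toy_target(seed)
    pairs: List[Tuple[bytes, bytes]] = []  # (parent, child) training pairs
    for i in range(iterations):
        parent = rng.choice(corpus)
        mutate = mutators[i % len(mutators)]  # alternate between mutators
        child = mutate(parent, rng)
        new_branches = toy_target(child) - covered
        if new_branches:
            covered |= new_branches
            corpus.append(child)
            pairs.append((parent, child))
    return covered, pairs


if __name__ == "__main__":
    rng = random.Random(0)
    print(fuzz([b"AAAA"], [random_byte_mutator], 200, rng)[0])
    print(fuzz([b"AAAA"], [random_byte_mutator, structured_mutator], 10, rng)[0])
```

In this toy setup the single-byte mutator never gets past the three-byte magic check from the seed `AAAA`, because no one-byte change yields new coverage that would keep an intermediate input in the corpus. The format-aware mutator reaches both deeper branches on its first use. A real fuzzer would use compiled instrumentation for coverage and a model call in place of `structured_mutator`.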

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling