RandSet: Randomized Corpus Reduction for Fuzzing Seed Scheduling
Yuchong Xie, Kaikai Zhang, Yu Liu, Rundong Yang, Ping Chen, Shuai Wang, Dongdong She

TL;DR
RandSet introduces a randomized corpus reduction method for fuzzing seed scheduling that effectively reduces seed explosion, enhances seed diversity, and maintains low overhead, leading to improved fuzzing performance.
Contribution
The paper presents RandSet, a novel randomized corpus reduction technique formulated as a set cover problem, which improves seed diversity and reduces overhead in fuzzing seed scheduling.
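To make the set cover framing concrete, here is a minimal sketch of one way randomness can be introduced into corpus reduction. It is not the authors' algorithm: the function name `randomized_set_cover`, the seed-to-edges mapping, and the choice to sample an uncovered edge and then a covering seed are all assumptions made for illustration.

```python
import random


def randomized_set_cover(coverage, rng):
    """Return a list of seed ids whose combined coverage equals the whole corpus'.

    coverage: dict mapping a seed id to the set of coverage elements
              (e.g. edge ids) that the seed exercises. Hypothetical layout.
    rng:      a random.Random instance, so runs can be reproduced.
    """
    # Everything that at least one seed in the corpus covers.
    uncovered = set().union(*coverage.values())
    chosen = []
    while uncovered:
        # Pick a random element that is still uncovered...
        element = rng.choice(sorted(uncovered))
        # ...then a random seed among those that cover it.
        candidates = sorted(s for s, elems in coverage.items() if element in elems)
        seed = rng.choice(candidates)
        chosen.append(seed)
        # Everything this seed covers is now accounted for.
        uncovered -= coverage[seed]
    return chosen


if __name__ == "__main__":
    corpus = {
        "seed_a": {1, 2},
        "seed_b": {2, 3},
        "seed_c": {3, 4},
        "seed_d": {1, 4},
    }
    # Different random states can yield different, equally valid subsets.
    for run in range(3):
        print(run, randomized_set_cover(corpus, random.Random(run)))
```

Each call returns a subset that preserves the corpus' total coverage, and repeated calls with different random states can return different subsets, which is the intuition behind using randomness to diversify seed selection across scheduling rounds.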
Findings
Achieves significantly more diverse seed selection than existing techniques.
Provides up to 16.58% coverage gain on standalone programs.
Triggers up to 7 more bugs than state-of-the-art methods.
Abstract
Seed explosion is a fundamental problem in fuzzing seed scheduling, where a fuzzer maintains a huge corpus and fails to choose promising seeds. Existing works focus on seed prioritization but still suffer from seed explosion since corpus size remains huge. We tackle this from a new perspective: corpus reduction, i.e., computing a seed corpus subset. However, corpus reduction could lead to poor seed diversity and large runtime overhead. Prior techniques like cull_queue, AFL-Cmin, and MinSet suffer from poor diversity or prohibitive overhead, making them unsuitable for high-frequency seed scheduling. We propose RandSet, a novel randomized corpus reduction technique that reduces corpus size and yields diverse seed selection simultaneously with minimal overhead. Our key insight is introducing randomness into corpus reduction to enjoy two benefits of a randomized algorithm: randomized…
Taxonomy
Topics: Software Testing and Debugging Techniques · Parallel Computing and Optimization Techniques · Software Engineering Research
