ExplainFuzz: Explainable and Constraint-Conditioned Test Generation with Probabilistic Circuits
Annaëlle Baiget, Jaron Maene, Seongmin Lee, Benjie Wang, Guy Van den Broeck, and Miryung Kim

TL;DR
ExplainFuzz is a novel test generation framework that uses Probabilistic Circuits to produce realistic, constrained, and explainable inputs for software testing, outperforming existing methods in coherence, diversity, and bug detection.
Contribution
It introduces a grammar-aware probabilistic circuit approach for interpretable, constraint-conditioned test input generation, improving over traditional grammar-based and language model methods.
Findings
Significantly reduces perplexity compared to pCFGs, PCs, and LLMs.
Increases bug-triggering rates in SQL and XML testing.
Enhances input diversity through native conditioning capabilities.
Abstract
Understanding and explaining the structure of generated test inputs is essential for effective software testing and debugging. Existing approaches, including grammar-based fuzzers, probabilistic Context-Free Grammars (pCFGs), and Large Language Models (LLMs), suffer from critical limitations. They frequently produce ill-formed inputs that fail to reflect realistic data distributions, struggle to capture context-sensitive probabilistic dependencies, and lack explainability. We introduce ExplainFuzz, a test generation framework that leverages Probabilistic Circuits (PCs) to learn and query structured distributions over grammar-based test inputs in an interpretable and controllable way. Starting from a Context-Free Grammar (CFG), ExplainFuzz compiles a grammar-aware PC and trains it on existing inputs. New inputs are then generated via sampling. ExplainFuzz utilizes the conditioning capability of…
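The pipeline the abstract outlines (a grammar with weighted choices, sampling, and conditioning on a constraint) can be illustrated with a toy sketch. This is not ExplainFuzz's implementation: the grammar, the weights, and the function names below are hypothetical, and conditioning is done by brute-force enumeration of a small finite language rather than by inference in a compiled circuit. It only shows the idea that weighted alternatives behave like sum nodes, sequences behave like product nodes, and conditioning renormalizes the distribution over inputs that satisfy a constraint.

```python
import random

# Hypothetical toy grammar with made-up weights (in practice these would be
# learned from existing inputs). Each nonterminal maps to weighted alternatives
# (a "sum node"); each alternative is a sequence of symbols (a "product node").
WEIGHTS = {
    "query": [(0.6, ["SELECT ", "cols", " FROM t"]),
              (0.4, ["SELECT ", "cols", " FROM t WHERE ", "cond"])],
    "cols": [(0.5, ["a"]), (0.3, ["b"]), (0.2, ["a, b"])],
    "cond": [(0.7, ["a > 0"]), (0.3, ["b = 1"])],
}


def enumerate_strings(symbol):
    """Return {string: probability} for every string derivable from symbol.

    Only feasible because this toy grammar is finite and non-recursive.
    """
    if symbol not in WEIGHTS:  # terminal symbol: emitted with probability 1
        return {symbol: 1.0}
    dist = {}
    for weight, alternative in WEIGHTS[symbol]:
        partial = {"": 1.0}
        for child in alternative:  # product: concatenate independent children
            extended = {}
            for prefix, p_prefix in partial.items():
                for suffix, p_suffix in enumerate_strings(child).items():
                    key = prefix + suffix
                    extended[key] = extended.get(key, 0.0) + p_prefix * p_suffix
            partial = extended
        for text, prob in partial.items():  # sum: mix alternatives by weight
            dist[text] = dist.get(text, 0.0) + weight * prob
    return dist


def condition(dist, constraint):
    """Renormalize dist over the strings that satisfy constraint."""
    kept = {s: p for s, p in dist.items() if constraint(s)}
    total = sum(kept.values())
    if total == 0.0:
        raise ValueError("constraint has zero probability under the model")
    return {s: p / total for s, p in kept.items()}


def sample(dist, rng):
    """Draw one string according to its probability in dist."""
    strings = list(dist)
    return rng.choices(strings, weights=[dist[s] for s in strings], k=1)[0]


if __name__ == "__main__":
    full = enumerate_strings("query")
    with_where = condition(full, lambda s: "WHERE" in s)
    print(sample(with_where, random.Random(0)))
```

In this sketch the probability of any generated input decomposes into the rule weights along its derivation, which is one simple reading of "explainable"; a real probabilistic circuit would answer the same conditional queries without enumerating the language.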
