Branching Processes for QuickCheck Generators
Agustín Mista, Alejandro Russo, John Hughes

TL;DR
This paper introduces a mathematical approach using branching processes to predict and control the distribution of generated data in QuickCheck, enabling more accurate and user-aligned random testing for complex algebraic data types.
Contribution
It adapts branching process theory to predict generator behavior and develops heuristics for automatic probability adjustment, improving QuickCheck generator distribution control.
Findings
Improved code coverage with synthesized generators
Effective prediction of constructor counts in recursive ADTs
Enhanced alignment of generator distributions with user demands
Abstract
In QuickCheck (or, more generally, random testing), it is challenging to control the distributions of random data generators, especially for user-defined algebraic data types (ADTs). In this paper, we adapt results from an area of mathematics known as branching processes, and show how they help to analytically predict (at compile time) the expected number of generated constructors, even in the presence of mutually recursive or composite ADTs. Using our probabilistic formulas, we design heuristics capable of automatically adjusting probabilities in order to synthesize generators whose distributions are aligned with users' demands. We provide a Haskell implementation of our mechanism in a tool called DRaGeN and perform case studies with real-world applications. When generating random values, our synthesized QuickCheck generators show improvements in code coverage when compared with…
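To make the idea of predicting constructor counts concrete, here is a minimal sketch of a branching-process calculation for a single binary tree type with two constructors, Leaf and Node. It is an illustration under simplifying assumptions, not the paper's formulas or the DRaGeN implementation: the names `expected_counts` and `sample_counts`, the parameter `p_node` (the probability of choosing Node), and the rule that every value at the maximum depth is forced to be a Leaf are all hypothetical choices made for this example. Each Node produces two children, so the mean number of offspring is 2 * p_node, and the expected number of values at level n is (2 * p_node) ** n.

```python
import random


def expected_counts(p_node, depth):
    """Analytic expectation of (Node count, Leaf count).

    Assumed model: levels 0..depth-1 pick Node with probability p_node,
    and every value at level `depth` is forced to be a Leaf.
    """
    mean_offspring = 2 * p_node  # each Node has two recursive children
    # Expected number of values on the levels where a free choice is made.
    internal = sum(mean_offspring ** n for n in range(depth))
    exp_nodes = p_node * internal
    exp_leaves = (1 - p_node) * internal + mean_offspring ** depth
    return exp_nodes, exp_leaves


def sample_counts(p_node, depth, rng):
    """Simulate one generated tree level by level and count constructors."""
    nodes = leaves = 0
    frontier = 1  # number of values still to be generated at this level
    for level in range(depth + 1):
        next_frontier = 0
        for _ in range(frontier):
            if level < depth and rng.random() < p_node:
                nodes += 1
                next_frontier += 2
            else:
                leaves += 1
        frontier = next_frontier
    return nodes, leaves


if __name__ == "__main__":
    rng = random.Random(0)
    trials = 20000
    total_nodes = sum(sample_counts(0.5, 3, rng)[0] for _ in range(trials))
    print("predicted Node count:", expected_counts(0.5, 3)[0])
    print("simulated Node count:", total_nodes / trials)
```

Comparing the predicted and simulated averages shows why an analytic formula is useful: the prediction needs no sampling, so constructor probabilities can be tuned before any values are generated.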
