Automatic Test-Case Reduction in Proof Assistants: A Case Study in Coq

Jason Gross (CSAIL; MIRI); Th\'eo Zimmermann (PI.R2; IRIF; (UMR\_8243)); Rajashree Agrawal; Adam Chlipala (CSAIL)

arXiv:2202.13823·cs.SE·March 12, 2025·1 cites

Automatic Test-Case Reduction in Proof Assistants: A Case Study in Coq

Jason Gross (CSAIL, MIRI), Th\'eo Zimmermann (PI.R2, IRIF, (UMR\_8243)), Rajashree Agrawal, Adam Chlipala (CSAIL)

PDF

Open Access 1 Repo

TL;DR

This paper introduces the Coq Bug Minimizer, a tool that automates minimal test-case reduction for proof assistant bugs, significantly improving debugging efficiency and reproducibility in Coq development.

Contribution

The paper presents a novel test-case reduction tool for Coq, with insights into how reduction differs from traditional compilers, and demonstrates its effectiveness on real CI failures.

Findings

01

Reduces test cases in 75% of failures within a practical time frame.

02

Produces standalone test cases in 89% of cases.

03

Reduces test size to about one-third of the original.

Abstract

As the adoption of proof assistants increases, there is a need for efficiency in identifying, documenting, and fixing compatibility issues that arise from proof assistant evolution. We present the Coq Bug Minimizer, a tool for reproducing buggy behavior with minimal and standalone files, integrated with coqbot to trigger automatically on Coq reverse CI failures. Our tool eliminates the overhead of having to download, set up, compile, and then explore and understand large developments: enabling Coq developers to easily obtain modular test-case files for fast experimentation. In this paper, we describe insights about how test-case reduction is different in Coq than in traditional compilers. We expect that our insights will generalize to other proof assistants. We evaluate the Coq Bug Minimizer on over 150 CI failures. Our tool succeeds in reducing failures to smaller test cases in roughly…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

jasongross/coq-tools
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsSoftware Testing and Debugging Techniques · Software Engineering Research · Software System Performance and Reliability