PyResBugs: A Dataset of Residual Python Bugs for Natural Language-Driven Fault Injection
Domenico Cotroneo, Giuseppe De Rosa, Pietro Liguori

TL;DR
PyResBugs is a new dataset of residual Python bugs paired with natural language descriptions, enabling natural language-driven fault injection to improve AI-based testing of Python software.
Contribution
The paper introduces PyResBugs, a novel dataset linking residual bugs with natural language descriptions for fault injection in Python systems.
Findings
Provides a high-quality dataset for fault injection research
Enables natural language-driven fault simulation in Python
Bridges the gap between fault injection techniques and real-world bugs
Abstract
This paper presents PyResBugs, a curated dataset of residual bugs, i.e., defects that persist undetected during traditional testing but later surface in production, collected from major Python frameworks. Each bug in the dataset is paired with its corresponding fault-free (fixed) version and annotated with multi-level natural language (NL) descriptions. These NL descriptions enable natural language-driven fault injection, offering a novel approach to simulating real-world faults in software systems. By bridging the gap between software fault injection techniques and real-world representativeness, PyResBugs provides researchers with a high-quality resource for advancing AI-driven automated testing in Python systems.
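The abstract describes each entry as a faulty version, its fixed counterpart, and a set of multi-level NL descriptions. A minimal sketch of such a record follows, purely for illustration. The class name `BugRecord`, its field names, the level labels, and the toy `mean` function are all hypothetical and are not taken from the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import difflib


@dataclass
class BugRecord:
    """Hypothetical record layout; the real PyResBugs schema may differ."""
    faulty_code: str                      # version containing the residual bug
    fixed_code: str                       # fault-free (fixed) version
    # Maps an assumed level label to an NL description of the fault.
    descriptions: Dict[str, str] = field(default_factory=dict)

    def injection_diff(self) -> List[str]:
        """Unified diff that turns the fixed code into the faulty code."""
        return list(difflib.unified_diff(
            self.fixed_code.splitlines(),
            self.faulty_code.splitlines(),
            fromfile="fixed", tofile="faulty", lineterm=""))


# Toy example (not from the dataset): an off-by-one in a denominator.
record = BugRecord(
    faulty_code="def mean(xs):\n    return sum(xs) / (len(xs) - 1)\n",
    fixed_code="def mean(xs):\n    return sum(xs) / len(xs)\n",
    descriptions={
        "high_level": "The mean is computed with a wrong denominator.",
        "detailed": "Divide by len(xs) - 1 instead of len(xs) in mean().",
    },
)

diff_lines = record.injection_diff()
print("\n".join(diff_lines))
```

Under these assumptions, an NL-driven fault injector would take `fixed_code` and one of the descriptions as input and aim to produce something equivalent to `faulty_code`; the diff gives a reference for checking the result.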
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Software Reliability and Analysis Research
