Self-Supervised Bug Detection and Repair
Miltiadis Allamanis, Henry Jackson-Flux, Marc Brockschmidt

TL;DR
BugLab is a self-supervised learning approach that trains models to detect and repair bugs in code without large annotated datasets. Its Python implementation improves on baseline methods by up to 30% and found 19 previously unknown bugs in open-source software.
Contribution
Introduces BugLab, a novel self-supervised framework for bug detection and repair that co-trains detection and bug generation models without needing large labeled datasets.
Findings
Improves bug detection accuracy by up to 30% over baselines
Finds 19 previously unknown bugs in open-source software
Evaluated on a test dataset of 2374 real-life bugs
Abstract
Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large annotated corpora, training these analyses is challenging. Towards addressing this, we present BugLab, an approach for self-supervised learning of bug detection and repair. BugLab co-trains two models: (1) a detector model that learns to detect and repair bugs in code, (2) a selector model that learns to create buggy code for the detector to use as training data. A Python implementation of BugLab improves by up to 30% upon baseline methods on a test dataset of 2374 real-life bugs and finds 19 previously unknown bugs in open-source software.
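The data-generation side of this setup can be illustrated with a minimal sketch. It is not the authors' implementation: the `SWAPS` table, the function names, and the restriction to comparison-operator rewrites are assumptions made for illustration, and a uniform random choice stands in for the learned selector model. It shows only how buggy variants of correct code might be produced and labeled as training data for a detector (requires Python 3.9+ for `ast.unparse`).

```python
import ast
import copy
import random

# Hypothetical rewrite table: each comparison operator maps to a plausible "buggy" one.
SWAPS = {
    ast.Lt: ast.LtE, ast.LtE: ast.Lt,
    ast.Gt: ast.GtE, ast.GtE: ast.Gt,
    ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
}


def candidate_rewrites(source):
    """Return every variant of `source` with exactly one comparison operator swapped."""
    tree = ast.parse(source)
    compare_nodes = [n for n in ast.walk(tree) if isinstance(n, ast.Compare)]
    variants = []
    for idx, node in enumerate(compare_nodes):
        for op_pos, op in enumerate(node.ops):
            swap = SWAPS.get(type(op))
            if swap is None:
                continue
            # Mutate a deep copy so the original tree stays untouched.
            mutated = copy.deepcopy(tree)
            target = [n for n in ast.walk(mutated) if isinstance(n, ast.Compare)][idx]
            target.ops[op_pos] = swap()
            variants.append(ast.unparse(mutated))
    return variants


def sample_training_example(source, rng):
    """Return (code, label): label 1 for an injected bug, 0 for the original code.

    Assumption: a uniform random choice replaces the learned selector here.
    """
    variants = candidate_rewrites(source)
    if variants and rng.random() < 0.5:
        return rng.choice(variants), 1
    return ast.unparse(ast.parse(source)), 0


if __name__ == "__main__":
    snippet = "def f(a, b):\n    return a < b\n"
    for variant in candidate_rewrites(snippet):
        print(variant)
    print(sample_training_example(snippet, random.Random(0)))
```

In the approach described in the abstract, the selector is itself a trained model, so the random choice above is the part a learned component would replace.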
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research
Methods: Repair
