Self-Supervised Bug Detection and Repair

Miltiadis Allamanis; Henry Jackson-Flux; Marc Brockschmidt

arXiv:2105.12787 · cs.LG · November 17, 2021 · 38 cites

TL;DR

BugLab is a self-supervised learning approach that trains models to detect and repair bugs in code without large annotated datasets. Its Python implementation improves on baseline methods by up to 30% and finds 19 previously unknown bugs in open-source software.

Contribution

Introduces BugLab, a self-supervised framework for bug detection and repair that co-trains a detector model with a selector model that creates buggy code as training data, removing the need for large annotated corpora.

Findings

01 Improves on baseline methods by up to 30%

02 Finds 19 previously unknown bugs in open-source software

03 Evaluated on a test dataset of 2374 real-life bugs

Abstract

Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large annotated corpora, training these analyses is challenging. Towards addressing this, we present BugLab, an approach for self-supervised learning of bug detection and repair. BugLab co-trains two models: (1) a detector model that learns to detect and repair bugs in code, (2) a selector model that learns to create buggy code for the detector to use as training data. A Python implementation of BugLab improves by up to 30% upon baseline methods on a test dataset of 2374 real-life bugs and finds 19 previously unknown bugs in open-source software.
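The selector's role described in the abstract, creating buggy code for the detector to train on, can be pictured with a small sketch. It rests on assumptions and is not the BugLab implementation. The rewrite table `SWAPS`, the helpers `candidate_rewrites` and `select_hardest`, and the `detector_score` callable are all hypothetical names introduced here, and comparison-operator swaps are only one illustrative kind of rewrite. Learning is omitted, so a plain function stands in for the selector. The code uses only the standard-library `ast` module and needs Python 3.9 or later for `ast.unparse`.

```python
import ast

# Hypothetical rewrite table (an assumption for illustration): each comparison
# operator maps to a near-miss operator that yields a plausible bug.
SWAPS = {
    ast.Lt: ast.LtE, ast.LtE: ast.Lt,
    ast.Gt: ast.GtE, ast.GtE: ast.Gt,
    ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
}


def _swappable(tree):
    """Collect comparison nodes whose first operator has an entry in SWAPS."""
    return [
        node for node in ast.walk(tree)
        if isinstance(node, ast.Compare) and type(node.ops[0]) in SWAPS
    ]


def candidate_rewrites(source):
    """Return (line number, buggy source) pairs, one per swappable comparison."""
    candidates = []
    for index in range(len(_swappable(ast.parse(source)))):
        tree = ast.parse(source)  # fresh tree so each candidate carries one bug
        node = _swappable(tree)[index]
        node.ops[0] = SWAPS[type(node.ops[0])]()
        candidates.append((node.lineno, ast.unparse(tree)))
    return candidates


def select_hardest(candidates, detector_score):
    """Stand-in for the selector: pick the rewrite the detector is least sure about.

    detector_score is a hypothetical callable mapping source text to the
    detector's confidence that the code is buggy.
    """
    return min(candidates, key=lambda candidate: detector_score(candidate[1]))


if __name__ == "__main__":
    original = "def smaller(a, b):\n    if a < b:\n        return a\n    return b\n"
    for lineno, buggy in candidate_rewrites(original):
        print(f"rewrite at line {lineno}:\n{buggy}\n")
```

In a co-training loop of the kind the abstract describes, the detector would be trained on the selected buggy variants together with the location and fix of each introduced bug, and the selector would be updated in turn. Both learning steps are left out of this sketch.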

Code & Models

Repositories

microsoft/neurips21-self-supervised-bug-detection-and-repair
PyTorch · Official

Datasets

Nadav-Timor/PyPiBugs
dataset · 20 downloads

Videos

Self-Supervised Bug Detection and Repair · SlidesLive

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research

Methods: Repair