TL;DR
CURE introduces a code-aware neural machine translation approach for automatic program repair, leveraging pre-training, a specialized search strategy, and subword tokenization to improve fix accuracy over existing methods.
Contribution
CURE's novel combination of pre-trained programming language models, code-aware search, and subword tokenization advances neural APR beyond prior template-based and NMT approaches.
Findings
Correctly fixed 57 Defects4J bugs
Fixed 26 QuixBugs bugs
Outperformed all existing APR techniques on both benchmarks (Defects4J and QuixBugs)
Abstract
Automatic program repair (APR) is crucial to improve software reliability. Recently, neural machine translation (NMT) techniques have been used to fix software bugs automatically. While promising, these approaches have two major limitations. Their search space often does not contain the correct fix, and their search strategy ignores software knowledge such as strict code syntax. Due to these limitations, existing NMT-based techniques underperform the best template-based approaches. We propose CURE, a new NMT-based APR technique with three major novelties. First, CURE pre-trains a programming language (PL) model on a large software codebase to learn developer-like source code before the APR task. Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code. Finally, CURE uses…
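The code-aware search idea in the abstract (prefer compilable patches and patches close in length to the buggy code) can be illustrated with a small reranking sketch. This is not CURE's implementation: the `Candidate` type, the `length_penalty` weight, and the identifier-scope check used as a crude stand-in for compilability are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Candidate:
    tokens: List[str]   # tokenized candidate patch
    log_prob: float     # score from a hypothetical NMT model


def length_penalty(n_patch: int, n_buggy: int, weight: float = 0.1) -> float:
    """Hypothetical penalty that grows with the token-length gap to the buggy code."""
    return weight * abs(n_patch - n_buggy)


def uses_known_identifiers(tokens: List[str], in_scope: Set[str], keywords: Set[str]) -> bool:
    """Crude stand-in for a compilability check (an assumption, not a real compiler):
    every identifier-like token must be an in-scope name or a language keyword."""
    return all(t in in_scope or t in keywords for t in tokens if t.isidentifier())


def rerank(candidates: List[Candidate], buggy_tokens: List[str],
           in_scope: Set[str], keywords: Set[str]) -> List[Candidate]:
    """Drop candidates that fail the identifier check, then sort the rest by
    model score minus the length penalty, best first."""
    kept = [c for c in candidates if uses_known_identifiers(c.tokens, in_scope, keywords)]
    return sorted(
        kept,
        key=lambda c: c.log_prob - length_penalty(len(c.tokens), len(buggy_tokens)),
        reverse=True,
    )


buggy = ["if", "(", "x", ">", "0", ")"]
candidates = [
    Candidate(["if", "(", "y", ">=", "0", ")"], -0.5),  # 'y' is not in scope
    Candidate(["if", "(", "x", ">=", "0", "&&", "x", "<", "max", ")"], -0.9),
    Candidate(["if", "(", "x", ">=", "0", ")"], -1.0),
]
ranked = rerank(candidates, buggy, in_scope={"x", "max"}, keywords={"if"})
```

In this toy example the highest-scoring candidate is discarded because it references an out-of-scope name, and the longer candidate falls behind the same-length one once the length penalty is applied. A real system would apply such checks during beam search rather than only as a post-hoc rerank.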
Taxonomy
Methods: Repair
