SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair

Zimin Chen; Steve Kommrusch; Michele Tufano; Louis-Noël Pouchet; Denys Poshyvanyk; Martin Monperrus

arXiv:1901.01808·cs.SE·September 12, 2019

TL;DR

SequenceR is a sequence-to-sequence learning system for automatic program repair. It uses a copy mechanism to cope with the effectively unlimited vocabulary of source code, and it is trained on 35,578 samples curated from commits to open-source repositories.

Contribution

SequenceR is a novel end-to-end neural approach to program repair. It learns from a curated dataset of open-source commits and captures a wide range of repair operators without any domain-specific top-down design.

Findings

01 Perfectly predicts the fixed line for 950 of 4,711 test samples

02 Finds correct patches for 14 bugs in Defects4J

03 Evaluated on both independent real bug fixes and the Defects4J benchmark

Abstract

This paper presents a novel end-to-end approach to program repair based on sequence-to-sequence learning. We devise, implement, and evaluate a system, called SequenceR, for fixing bugs based on sequence-to-sequence learning on source code. This approach uses the copy mechanism to overcome the unlimited vocabulary problem that occurs with big code. Our system is data-driven; we train it on 35,578 samples, carefully curated from commits to open-source repositories. We evaluate it on 4,711 independent real bug fixes, as well as on the Defects4J benchmark used in program repair research. SequenceR is able to perfectly predict the fixed line for 950/4711 testing samples, and find correct patches for 14 bugs in Defects4J. It captures a wide range of repair operators without any domain-specific top-down design.
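To illustrate why a copy mechanism helps with the unlimited vocabulary problem, here is a minimal sketch of a generic pointer-generator style mixture, which may differ from the exact formulation used in SequenceR. The function name `copy_mixture`, the gate `p_gen`, and all toy values are assumptions made for illustration. The output distribution blends a fixed-vocabulary distribution with attention mass copied from the input tokens, so an identifier that is absent from the vocabulary can still be predicted.

```python
from typing import Dict, List

UNK = "<unk>"  # placeholder for out-of-vocabulary tokens (illustrative)


def copy_mixture(
    p_gen: float,
    vocab_probs: Dict[str, float],
    source_tokens: List[str],
    attention: List[float],
) -> Dict[str, float]:
    """Blend a fixed-vocabulary distribution with a copy distribution.

    p_gen is the probability of generating from the vocabulary;
    (1 - p_gen) is the probability of copying a source token.
    """
    if len(source_tokens) != len(attention):
        raise ValueError("one attention weight per source token is required")

    # Generation part: scale the fixed-vocabulary probabilities by p_gen.
    final = {tok: p_gen * p for tok, p in vocab_probs.items()}

    # Copy part: each source position adds its attention mass to its token,
    # so tokens outside the vocabulary can still receive probability.
    for tok, weight in zip(source_tokens, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


# Hypothetical toy inputs: "myCounter" is not in the fixed vocabulary.
vocab_probs = {"return": 0.5, "if": 0.3, UNK: 0.2}
source_tokens = ["return", "myCounter", ";"]
attention = [0.1, 0.8, 0.1]

dist = copy_mixture(0.4, vocab_probs, source_tokens, attention)
best = max(dist, key=dist.get)
print(best, round(dist[best], 2))
```

In this toy case the rare identifier `myCounter` wins because most of the attention mass falls on it, even though a vocabulary-only model could emit nothing more specific than the unknown-token placeholder.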

Taxonomy

Methods: Repair