Applying CodeBERT for Automated Program Repair of Java Simple Bugs

Ehsan Mashhadi; Hadi Hemmati

arXiv:2103.11626 · cs.SE · April 1, 2021

TL;DR

This paper proposes an automated program repair method that fine-tunes CodeBERT, a transformer-based model pre-trained on source code, on the ManySStuBs4J datasets. It predicts the developer's fix for Java bugs in 19-72% of cases, depending on the dataset, in less than a second per bug.

Contribution

The study presents a CodeBERT-based approach to automatic Java bug fixing that generates both short and long fixes and handles different bug types, including types with only a few instances in the training data.

Findings

01. Predicts the developer's fix in 19-72% of cases, depending on the dataset

02. Generates both short and long fixes for different bug types

03. Produces each fix in less than a second

Abstract

Software debugging and program repair are among the most time-consuming and labor-intensive tasks in software engineering, and both would benefit greatly from automation. In this paper, we propose a novel automated program repair approach based on CodeBERT, a transformer-based neural architecture pre-trained on a large corpus of source code. We fine-tune our model on the ManySStuBs4J small and large datasets to automatically generate the fixed code. The results show that our technique accurately predicts the fixed code implemented by the developers in 19-72% of the cases, depending on the dataset, in less than a second per bug. We also observe that our method can generate fixes of varied length (short and long) and can fix different types of bugs, even when only a few instances of a bug type exist in the training dataset.
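The 19-72% figure measures how often the generated fix matches the one the developers wrote. The listing does not spell out the exact metric, so the following is only a minimal sketch. It assumes an exact-match comparison after whitespace normalization. The function names and the toy Java snippets are hypothetical and are not taken from the paper or its repository.

```python
def normalize(code: str) -> str:
    """Collapse runs of whitespace so formatting differences are ignored (an assumption)."""
    return " ".join(code.split())


def exact_match_accuracy(predictions, references):
    """Return the fraction of predicted fixes identical to the developer fixes."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical toy data: three single-statement Java fixes.
predicted = ["if (x >= 0) return x;", "int  i = 0;", "foo.bar(a, b);"]
developer = ["if (x >= 0) return x;", "int i = 0;", "foo.bar(b, a);"]

accuracy = exact_match_accuracy(predicted, developer)
print(f"exact-match accuracy: {accuracy:.2%}")
```

In this toy example the second pair differs only in spacing and counts as a match, while the third pair swaps two arguments and does not. A stricter or looser normalization, for example token-level comparison, would change the reported number.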

Code & Models

Repositories

EhsanMashhadi/MSR2021-ProgramRepair
PyTorch · Official
