Automated Program Repair: Emerging trends pose and expose problems for benchmarks

Joseph Renzullo; Pemma Reiter; Westley Weimer; Stephanie Forrest

arXiv:2405.05455·cs.SE·May 10, 2024·1 cites

TL;DR

This paper discusses how emerging machine learning techniques, especially large language models, are transforming automated program repair and highlights the challenges in evaluating these new approaches using existing benchmarks.

Contribution

It identifies the mismatch between current APR benchmarks and ML-based methods, emphasizing the need for better evaluation practices for LLM-driven repair techniques.

Findings

01 Existing benchmarks may be biased because they overlap with LLM training data.

02 ML-based APR methods are evolving rapidly and require new evaluation standards.

03 Ensuring the validity and generalizability of results obtained with ML techniques is challenging.

Abstract

Machine learning (ML) now pervades the field of Automated Program Repair (APR). Algorithms deploy neural machine translation and large language models (LLMs) to generate software patches, among other tasks. But there are important differences between these applications of ML and earlier work. Evaluations and comparisons must take care to ensure that results are valid and likely to generalize. A challenge is that the most popular APR evaluation benchmarks were not designed with ML techniques in mind. This is especially true for LLMs, whose large and often poorly disclosed training datasets may include problems on which they are evaluated.

Taxonomy

Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Radiation Effects in Electronics