A Deep Dive into Large Language Models for Automated Bug Localization and Repair

Soneya Binta Hossain; Nan Jiang; Qiang Zhou; Xiaopeng Li; Wen-Hao Chiang; Yingjun Lyu; Hoan Nguyen; Omer Tripp

arXiv:2404.11595·cs.SE·May 13, 2024·1 cites



TL;DR

This paper introduces Toggle, a novel framework that separates bug localization from bug fixing using large language models, achieving state-of-the-art results on automated program repair benchmarks.

Contribution

It presents a new approach that employs different LLMs for bug localization and fixing, improving integration of contextual information and inductive biases.

Findings

01 Toggle achieves state-of-the-art (SOTA) performance on the CodeXGLUE benchmark.

02 It outperforms existing methods on several APR datasets.

03 Effective prompting strategies significantly enhance bug-fixing results.

Abstract

Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR). In this study, we take a deep dive into automated bug fixing utilizing LLMs. In contrast to many deep learning-based APR methods that assume known bug locations, rely on line-level localization tools, or address bug prediction and fixing in one step, our approach uniquely employs LLMs to predict bug location at the token level and subsequently utilizes them for bug fixing. This methodological separation of bug localization and fixing using different LLMs enables effective integration of diverse contextual information and improved incorporation of inductive biases. We introduce Toggle: Token-Granulated Bug Localization and Repair, a comprehensive program repair framework that integrates a bug localization model, an adjustment unit, and a…
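The two-stage idea described in the abstract, token-level localization followed by a separate fixing step, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `<bug>` and `</bug>` markers, the whitespace tokenization, and the hard-coded span indices are all assumptions made for illustration. In a real pipeline the span would come from a localization model and the marked string would be passed to a fixing model.

```python
from typing import List, Tuple


def split_at_bug_span(
    tokens: List[str], start: int, end: int
) -> Tuple[List[str], List[str], List[str]]:
    """Split tokens into (prefix, buggy span, suffix).

    `start` and `end` are inclusive token indices. Here they are supplied
    by hand; a localization model would predict them (assumption).
    """
    if not (0 <= start <= end < len(tokens)):
        raise ValueError("bug span is out of range")
    return tokens[:start], tokens[start:end + 1], tokens[end + 1:]


def mark_bug_span(tokens: List[str], start: int, end: int) -> str:
    """Build a fixer input with the suspected span wrapped in markers.

    The <bug>...</bug> marker format is hypothetical, not from the paper.
    """
    prefix, buggy, suffix = split_at_bug_span(tokens, start, end)
    return " ".join(prefix + ["<bug>"] + buggy + ["</bug>"] + suffix)


# Whitespace tokenization is a simplification; real models use subword tokens.
tokens = "if ( x > 0 ) return y ;".split()
marked = mark_bug_span(tokens, 3, 3)
print(marked)
```

Keeping the split and the prompt construction in separate functions mirrors the separation of localization and fixing: the span prediction can be swapped out without touching how the fixer input is assembled.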


Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Natural Language Processing Techniques