Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5
Nghi D. Q. Bui, Yue Wang, Steven Hoi

TL;DR
This paper introduces CodeT5-DLR, a unified framework that uses the pretrained language model CodeT5 to detect, localize, and repair bugs in code, and reports better results than existing baselines on Java and Python debugging datasets.
Contribution
The paper presents what it describes as the first unified model for bug detection, localization, and repair, integrating all three tasks into a single framework based on CodeT5.
Findings
Outperforms existing baselines in bug detection, localization, and repair.
Effective on datasets in both Java and Python.
Demonstrates the mutual benefits of joint debugging tasks.
Abstract
Automated software debugging is a crucial task for improving the productivity of software developers. Many neural-based techniques have been proven effective for debugging-related tasks such as bug localization and program repair (or bug fixing). However, these techniques often focus only on either one of them or approach them in a stage-wise manner, ignoring the mutual benefits between them. In this work, we propose a novel unified Detect-Localize-Repair framework based on a pretrained programming language model CodeT5 to seamlessly address these tasks, named CodeT5-DLR. Specifically, we propose three objectives to adapt the generic CodeT5 for debugging: a bug detection objective to determine whether a given code snippet is buggy or not, a bug localization objective to identify the buggy lines, and a program repair objective to translate the buggy code to its fixed version. We…
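To make the three objectives in the abstract concrete, here is a minimal sketch of how a single (buggy, fixed) pair could be turned into one target per task. This is not the authors' data pipeline: the `DebugExample` container, the `build_example` helper, and the use of a line-level diff from Python's `difflib` to mark buggy lines are all assumptions made for illustration only.

```python
import difflib
from dataclasses import dataclass
from typing import List


@dataclass
class DebugExample:
    """Hypothetical container for the three targets (not from the paper)."""
    source: str             # snippet given to the model
    is_buggy: int           # detection target: 1 = buggy, 0 = clean
    buggy_lines: List[int]  # localization target: one 0/1 flag per source line
    repair_target: str      # repair target: the fixed snippet


def build_example(code: str, fixed: str) -> DebugExample:
    """Derive detection, localization, and repair targets from a code pair.

    Assumption: a source line counts as buggy if a line-level diff replaces
    or deletes it. Pure insertions in the fix flag no source line here.
    """
    src_lines = code.splitlines()
    fix_lines = fixed.splitlines()
    flags = [0] * len(src_lines)

    matcher = difflib.SequenceMatcher(a=src_lines, b=fix_lines, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            for i in range(i1, i2):
                flags[i] = 1

    return DebugExample(
        source=code,
        is_buggy=int(src_lines != fix_lines),
        buggy_lines=flags,
        repair_target=fixed,
    )


buggy = "def add(a, b):\n    return a - b\n"
fixed = "def add(a, b):\n    return a + b\n"

example = build_example(buggy, fixed)
print(example.is_buggy)     # 1
print(example.buggy_lines)  # [0, 1]
print(example.repair_target == fixed)  # True
```

Passing the same snippet as both arguments yields a clean example (`is_buggy == 0`, all line flags zero), which is the kind of negative sample a detection objective would need alongside buggy ones.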
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software System Performance and Reliability
Methods: Multi-Head Attention · Attention Is All You Need · Repair · Linear Layer · Residual Connection · Dense Connections · Layer Normalization · Softmax · Dropout · Byte Pair Encoding
