BLAZE: Cross-Language and Cross-Project Bug Localization via Dynamic Chunking and Hard Example Learning
Partha Chakraborty, Mahmoud Alfadel, Meiyappan Nagappan

TL;DR
BLAZE is a bug localization approach that combines dynamic code chunking with hard example learning on a GPT-based model to improve cross-language and cross-project bug localization, supported by a new large-scale dataset.
Contribution
The paper introduces BLAZE, a method that combines dynamic chunking with fine-tuning on challenging bug cases (hard examples), and provides the BEETLEBOX dataset for multi-language bug localization.
Findings
BLAZE outperforms six state-of-the-art baselines in accuracy metrics.
BLAZE achieves up to a 120% increase in Top-1 accuracy.
The BEETLEBOX dataset covers 29 projects across five programming languages.
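To make the Top-1 figure concrete, here is a minimal sketch of how Top-k accuracy and a relative increase are commonly computed for file-level bug localization. The function names, the file names, and the numbers are hypothetical and are not taken from the paper; the sketch also assumes the 120% figure is a relative increase over a baseline.

```python
def top_k_accuracy(rankings, buggy_files, k):
    """Fraction of bug reports with at least one truly buggy file in the top k.

    rankings: one ranked list of file paths per bug report (best first).
    buggy_files: one set of ground-truth buggy file paths per bug report.
    """
    hits = sum(
        1
        for ranked, truth in zip(rankings, buggy_files)
        if set(ranked[:k]) & set(truth)
    )
    return hits / len(rankings)


def relative_increase(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100


# Hypothetical rankings for two bug reports (illustration only).
rankings = [["a.py", "b.py"], ["c.py", "d.py"]]
buggy_files = [{"a.py"}, {"d.py"}]

top1 = top_k_accuracy(rankings, buggy_files, k=1)  # only the first report hits
top2 = top_k_accuracy(rankings, buggy_files, k=2)  # both reports hit
```

Under this reading, a hypothetical baseline with Top-1 accuracy of 0.10 and a new model at 0.22 would correspond to an increase of about 120%.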
Abstract
Software bugs require developers to exert significant effort to identify and resolve them, often consuming about one-third of their time. Bug localization, the process of pinpointing the exact source code files that need modification, is crucial in reducing this effort. Existing bug localization tools, typically reliant on deep learning techniques, face limitations in cross-project applicability and effectiveness in multi-language environments. Recent advancements with Large Language Models (LLMs) offer detailed representations for bug localization. However, they encounter challenges with limited context windows and mapping accuracy. To address these issues, we propose BLAZE, an approach that employs dynamic chunking and hard example learning. First, BLAZE dynamically segments source code to minimize continuity loss. Then, BLAZE fine-tunes a GPT-based model using challenging bug cases,…
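The abstract does not spell out how the dynamic segmentation works, so the following is only an illustrative sketch of the general idea: split a file into chunks that fit a token budget while preferring to cut at natural boundaries. It assumes whitespace-separated words as a stand-in for model tokens and blank lines as the preferred breakpoints; `chunk_source` and `max_tokens` are hypothetical names, and this is not the algorithm used in BLAZE.

```python
def chunk_source(lines, max_tokens):
    """Greedily split source lines into chunks of at most max_tokens tokens.

    Assumptions (not from the paper): tokens are whitespace-separated words,
    and a blank line marks a good place to cut. A single line longer than
    max_tokens still becomes its own (oversized) chunk.
    """
    chunks, current, count = [], [], 0
    last_break = 0  # length of `current` just after the most recent blank line
    for line in lines:
        n = len(line.split())
        while current and count + n > max_tokens:
            # Prefer cutting after the last blank line; otherwise cut here.
            cut = last_break if 0 < last_break < len(current) else len(current)
            chunks.append(current[:cut])
            current = current[cut:]
            count = sum(len(l.split()) for l in current)
            last_break = 0
        current.append(line)
        count += n
        if not line.strip():
            last_break = len(current)
    if current:
        chunks.append(current)
    return chunks


source = ["def a():", "    return 1", "", "def b():", "    return 2"]
chunks = chunk_source(source, max_tokens=6)
```

With this input the cut falls at the blank line rather than in the middle of `b`, which is the kind of continuity a fixed-size split would lose. A real system would count tokens with the model's own tokenizer and would likely use syntax-aware boundaries.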
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices
