Fault Localisation and Repair for DL Systems: An Empirical Study with LLMs

Jinhan Kim; Nargiz Humbatova; Gunel Jahangirova; Shin Yoo; Paolo Tonella

arXiv:2506.03396·cs.SE·June 5, 2025

TL;DR

This study evaluates existing fault localisation and repair techniques for deep learning models, introduces a novel LLM-based approach, and shows that GPT-4 brings significant improvements in both tasks on a new benchmark.

Contribution

The paper provides a comprehensive evaluation of current fault localisation and repair techniques for DL systems and introduces a novel LLM-based approach that significantly improves performance on both tasks.

Findings

01. GPT-4 achieves a 44% improvement in fault localisation.

02. GPT-4 achieves an 82% improvement in repair tasks.

03. Current techniques have notable limitations in accuracy.

Abstract

Numerous Fault Localisation (FL) and repair techniques have been proposed to address faults in Deep Learning (DL) models. However, their effectiveness in practical applications remains uncertain due to the reliance on pre-defined rules. This paper presents a comprehensive evaluation of state-of-the-art FL and repair techniques, examining their advantages and limitations. Moreover, we introduce a novel approach that harnesses the power of Large Language Models (LLMs) in localising and repairing DL faults. Our evaluation, conducted on a carefully designed benchmark, reveals the strengths and weaknesses of current FL and repair techniques. We emphasise the importance of enhanced accuracy and the need for more rigorous assessment methods that employ multiple ground truth patches. Notably, LLMs exhibit remarkable performance in both FL and repair tasks. For instance, the GPT-4 model achieves…
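The abstract's call for assessment against multiple ground truth patches can be illustrated with a small sketch. It is not the paper's metric: the function names, the fault labels, and the choice of "best recall over all alternative patches" are assumptions made for illustration only. The idea is that a fault can often be fixed in more than one valid way, so a tool's output is scored against each acceptable patch and credited with the best match.

```python
from typing import Iterable, Set


def recall(predicted: Set[str], ground_truth: Set[str]) -> float:
    """Fraction of ground-truth fault labels that the tool reported."""
    if not ground_truth:
        return 0.0
    return len(predicted & ground_truth) / len(ground_truth)


def best_recall(predicted: Iterable[str],
                ground_truths: Iterable[Iterable[str]]) -> float:
    """Score a prediction against every alternative patch; keep the best match.

    Hypothetical scoring rule, assumed here for illustration.
    """
    predicted_set = set(predicted)
    scores = [recall(predicted_set, set(gt)) for gt in ground_truths]
    return max(scores, default=0.0)


# Hypothetical fault labels: two alternative patches repair the same model.
patches = [{"learning_rate", "optimizer"}, {"loss_function"}]
reported = {"loss_function", "epochs"}

single = recall(reported, patches[0])   # scored against one patch only
multi = best_recall(reported, patches)  # scored against all known patches
print(single, multi)
```

In this made-up case the tool gets no credit when only the first patch is treated as the ground truth, but full credit once the second, equally valid patch is considered, which is the kind of evaluation gap a single ground truth can hide.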

Code & Models

Repositories

testingautomated-usi/dl-fl-repair
tf · Official

Taxonomy

Topics: Software Testing and Debugging Techniques · Radiation Effects in Electronics · Adversarial Robustness in Machine Learning