Efficient Black-Box Fault Localization for System-Level Test Code Using Large Language Models
Ahmadreza Saboor Yaraghi, Golnaz Gharachorlu, Sakina Fatima, Lionel C. Briand, Ruiyuan Wan, Ruifeng Gao

TL;DR
This paper presents a static, LLM-driven fault localization method for system-level test code that does not require executing the test case, reducing inference time and token usage while maintaining high accuracy.
Contribution
It introduces a novel, execution-free approach for fault localization in complex test scripts using large language models and failure logs.
Findings
Achieves an F1 score of around 90% for fault localization.
Reduces LLM inference time by up to 34%.
Uses 93% fewer tokens than previous LLM-guided methods.
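For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it over sets of flagged locations. It is illustrative only: the function name, the use of line numbers as locations, and the example values are assumptions, and this summary does not state the granularity at which the paper evaluates F1.

```python
def f1_score(predicted: set[int], actual: set[int]) -> float:
    """F1 of predicted faulty locations against ground-truth faulty locations.

    Locations are modeled as line numbers purely for illustration (an assumption).
    """
    true_positives = len(predicted & actual)
    if true_positives == 0:
        # No correct predictions: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = true_positives / len(predicted)  # share of flagged lines that are faulty
    recall = true_positives / len(actual)        # share of faulty lines that were flagged
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: three lines flagged, two truly faulty, one faulty line missed.
print(f1_score({10, 11, 42}, {10, 11, 57}))
```

Because it is a harmonic mean, F1 is high only when both precision and recall are high, so a score near 90% implies few false alarms and few missed faults.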
Abstract
Fault localization (FL) is a critical step in debugging that typically relies on repeated executions to pinpoint faulty code regions. However, repeated executions can be impractical when failures are non-deterministic or execution costs are high. While recent efforts have leveraged Large Language Models (LLMs) to aid execution-free FL, they have primarily focused on identifying faults in the system-under-test (SUT) rather than in the often complex system-level test code. Test code matters as well, since in practice many failures are triggered by faulty test code. To overcome these challenges, we introduce a fully static, LLM-driven approach for system-level test code fault localization (TCFL) that does not require executing the test case. Our method uses a single failure execution log to estimate the test's execution trace through three novel algorithms that identify…
