LIDL: LLM Integration Defect Localization via Knowledge Graph-Enhanced Multi-Agent Analysis

Gou Tan; Zilong He; Min Li; Pengfei Chen; Jieke Shi; Zhensu Sun; Ting Zhang; Danwen Chen; Lwin Khin Shar; Chuanfu Zhang; David Lo

arXiv:2601.05539·cs.SE·January 12, 2026

Open Access

TL;DR

LIDL is a multi-agent framework that localizes defects in LLM-integrated software by combining knowledge graphs, multi-source error evidence, and counterfactual reasoning. It reaches a Top-3 accuracy of 0.64 and outperforms five state-of-the-art baselines.

Contribution

This paper introduces LIDL, a multi-agent defect localization approach specifically designed for LLM-integrated software, addressing the limitations of existing techniques in handling cross-layer dependencies and semantic reasoning.

Findings

01. LIDL achieves a Top-3 accuracy of 0.64.

02. LIDL reduces defect localization cost by 92.5%.

03. LIDL outperforms five state-of-the-art baselines across all metrics.
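
For readers unfamiliar with the metric, Top-k accuracy in defect localization is commonly computed as the fraction of failing cases in which at least one true defect location appears among the first k ranked candidates. The sketch below assumes that common definition; the paper's exact formulation is not given on this page, and the function name, the candidate names, and the three example cases are hypothetical.

```python
from typing import Sequence, Set


def top_k_accuracy(
    rankings: Sequence[Sequence[str]],
    truths: Sequence[Set[str]],
    k: int,
) -> float:
    """Fraction of cases whose first k ranked candidates contain a true defect.

    Assumes the common Top-k definition; not taken from the paper.
    """
    if len(rankings) != len(truths):
        raise ValueError("rankings and truths must have the same length")
    if not rankings:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(rankings, truths)
        if any(candidate in truth for candidate in ranked[:k])
    )
    return hits / len(rankings)


# Hypothetical data: three failing cases, each with a ranked candidate list.
example_rankings = [
    ["prompt_template", "api_call", "config"],   # true defect ranked 2nd
    ["config", "parser", "prompt_template"],     # true defect ranked 1st
    ["parser", "api_call", "config", "retry"],   # true defect ranked 4th
]
example_truths = [{"api_call"}, {"config"}, {"retry"}]

top3 = top_k_accuracy(example_rankings, example_truths, k=3)
top1 = top_k_accuracy(example_rankings, example_truths, k=1)
```

Under this definition, a Top-3 accuracy of 0.64 would mean that in 64% of cases a true defect location appears within the first three suggestions.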

Abstract

LLM-integrated software, which embeds or interacts with large language models (LLMs) as functional components, exhibits probabilistic and context-dependent behaviors that fundamentally differ from those of traditional software. This shift introduces a new category of integration defects that arise not only from code errors but also from misaligned interactions among LLM-specific artifacts, including prompts, API calls, configurations, and model outputs. However, existing defect localization techniques are ineffective at identifying these LLM-specific integration defects because they fail to capture cross-layer dependencies across heterogeneous artifacts, cannot exploit incomplete or misleading error traces, and lack semantic reasoning capabilities for identifying root causes. To address these challenges, we propose LIDL, a multi-agent framework for defect localization in…
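
To make "cross-layer dependencies across heterogeneous artifacts" concrete, the following is a minimal illustrative sketch, not LIDL's actual design. It models prompts, API calls, configurations, and code as typed nodes in a small dependency graph and collects every artifact reachable from a failing function with a breadth-first traversal. All node names, the edge direction, and the function `reachable_artifacts` are assumptions made up for illustration.

```python
from collections import deque
from typing import Dict, List

# Hypothetical artifact graph: each key depends on the artifacts it lists.
# The "layer:name" prefix marks the artifact type (code, api_call, prompt, config).
DEPENDS_ON: Dict[str, List[str]] = {
    "code:answer_question": ["api_call:chat_completion", "prompt:qa_template"],
    "api_call:chat_completion": ["config:model_settings"],
    "prompt:qa_template": [],
    "config:model_settings": [],
    "code:render_page": [],
}


def reachable_artifacts(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Return artifacts reachable from `start` in breadth-first order."""
    seen = {start}
    order: List[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


# Candidate locations for a failure observed in the hypothetical answer_question.
candidates = reachable_artifacts(DEPENDS_ON, "code:answer_question")
layers = {name.split(":", 1)[0] for name in candidates}
```

Here the candidate set spans four artifact types, which is the kind of relationship a code-only analysis would not surface; unrelated code such as the hypothetical `code:render_page` is excluded.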

Taxonomy

Topics: Software Engineering Research · Software System Performance and Reliability · Software Testing and Debugging Techniques