SieveFL: Hierarchical Runtime-Aware Pruning for Scalable LLM-Based Fault Localization

Mahdi Farzandway; Fatemeh Ghassemi

arXiv:2605.13491·cs.SE·May 14, 2026

TL;DR

SieveFL is a five-stage hierarchical framework that combines retrieval-based file filtering, runtime coverage analysis, and LLMs to make fault localization more accurate and less costly on large codebases by shrinking the candidate set before the LLM is invoked.

Contribution

The paper introduces a five-stage hierarchical approach that integrates filtering, runtime data, and LLMs, improving accuracy and efficiency without relying on proprietary models.

Findings

01 Top-1 accuracy of 41.8% on 395 bugs

02 Reduces candidate methods by 79% and token consumption by 49%

03 Improves ranking quality, especially Top-3 to Top-10

Abstract

Automated fault localization requires connecting an observed test failure to the responsible method across thousands of candidates, a task that purely statistical approaches handle with limited precision and that LLMs cannot yet handle at full project scale due to prohibitive token cost and signal dilution. We present SieveFL, a five-stage hierarchical framework that resolves this tension through aggressive pre-LLM filtering. SieveFL converts a failing test into a natural-language failure description, uses dense vector retrieval to narrow the search to a small set of suspicious files, and then eliminates any method not executed during the failing test via JaCoCo runtime traces. Only the surviving candidates are passed to the LLM, which screens each method individually and re-ranks the confirmed suspects in a single comparative pass. We evaluate SieveFL on 395 bugs from Defects4J v1.2.0…
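The filtering order described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Method` record, the function names, the plain-list embedding vectors, and the `screen` and `rerank` callables (stand-ins for the LLM calls) are all hypothetical, and the coverage data is assumed to have already been parsed from a JaCoCo report into a set of `(file, method)` pairs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """Hypothetical record for one candidate method."""
    file: str
    name: str
    body: str


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_files(query_vec, file_vecs, k):
    """Rank files by similarity to the failure-description embedding; keep the top k."""
    ranked = sorted(file_vecs, key=lambda f: cosine(query_vec, file_vecs[f]), reverse=True)
    return ranked[:k]


def prune_by_coverage(methods, suspicious_files, covered):
    """Keep only methods in suspicious files that the failing test actually executed.

    `covered` is assumed to be a set of (file, method_name) pairs derived
    from a runtime coverage trace.
    """
    return [
        m for m in methods
        if m.file in suspicious_files and (m.file, m.name) in covered
    ]


def localize(query_vec, file_vecs, methods, covered, screen, rerank, k=2):
    """Run the filtering stages in order, calling the LLM stand-ins last.

    `screen(method) -> bool` and `rerank(list_of_methods) -> list_of_methods`
    are placeholders for the per-method screening and the single comparative
    re-ranking pass.
    """
    suspicious_files = set(top_k_files(query_vec, file_vecs, k))
    candidates = prune_by_coverage(methods, suspicious_files, covered)
    suspects = [m for m in candidates if screen(m)]
    return rerank(suspects)
```

The point of the ordering is that the cheap stages (vector similarity and a set lookup against coverage data) run over all candidates, while the expensive `screen` and `rerank` calls only see methods that survived both filters.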
