Neurosymbolic Repo-level Code Localization
Xiufeng Xu, Xiufeng Wu, Zejun Zhang, Yi Li

TL;DR
This paper introduces LogicLoc, a framework that combines large language models with logical reasoning to produce accurate, verifiable code localization. It avoids reliance on superficial keyword matching and performs well on both diagnostic and real-world benchmarks.
Contribution
The paper formalizes the challenge of keyword-agnostic code localization (KA-LCL), builds the KA-LogicQuery diagnostic benchmark, and proposes LogicLoc, which integrates LLMs with logical reasoning to improve accuracy and efficiency.
Findings
LogicLoc outperforms state-of-the-art methods on KA-LogicQuery.
LogicLoc achieves better performance with lower token use and faster execution.
State-of-the-art models struggle with structural reasoning in code localization.
Abstract
Code localization is a cornerstone of autonomous software engineering. Recent advancements have achieved impressive performance on real-world issue benchmarks. However, we identify a critical yet overlooked bias: these benchmarks are saturated with keyword references (e.g. file paths, function names), encouraging models to rely on superficial lexical matching rather than genuine structural reasoning. We term this phenomenon the Keyword Shortcut. To address this, we formalize the challenge of Keyword-Agnostic Logical Code Localization (KA-LCL) and introduce KA-LogicQuery, a diagnostic benchmark requiring structural reasoning without any naming hints. Our evaluation reveals a catastrophic performance drop of state-of-the-art approaches on KA-LogicQuery, exposing their lack of deterministic reasoning capabilities. We propose LogicLoc, a novel agentic framework that combines large language…
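To make the idea of a keyword-agnostic query concrete, here is a minimal sketch, not the paper's implementation. It assumes a toy setting in which structural facts (who calls whom, and whether the call sits inside a loop) are extracted with Python's standard `ast` module, and queries are answered over those facts alone. The sample source, the helper names (`extract_facts`, `directly_recursive`, `loop_callers`), and the two queries are all hypothetical illustrations. The function names in the sample are deliberately uninformative, so lexical matching on the query text would not help.

```python
import ast

# Hypothetical sample "repository": names carry no hint about behavior.
SOURCE = '''
def alpha(n):
    if n <= 1:
        return 1
    return n * alpha(n - 1)

def beta(x, y):
    return x + y

def gamma(items):
    total = 0
    for item in items:
        total += beta(item, 1)
    return total
'''

def extract_facts(source):
    """Extract structural facts: defined functions, call edges, and call edges inside loops."""
    tree = ast.parse(source)
    defined, calls, loop_calls = set(), set(), set()
    for fn in ast.walk(tree):
        if not isinstance(fn, ast.FunctionDef):
            continue
        defined.add(fn.name)
        for node in ast.walk(fn):
            # Record direct calls to plain names, e.g. beta(...).
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                calls.add((fn.name, node.func.id))
            # Record calls that occur anywhere inside a for/while loop.
            if isinstance(node, (ast.For, ast.While)):
                for inner in ast.walk(node):
                    if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                        loop_calls.add((fn.name, inner.func.id))
    return defined, calls, loop_calls

def directly_recursive(defined, calls):
    """Query: which functions invoke themselves?"""
    return sorted(f for f in defined if (f, f) in calls)

def loop_callers(defined, loop_calls):
    """Query: which functions call a *different* locally defined function inside a loop?"""
    return sorted({caller for caller, callee in loop_calls
                   if callee in defined and callee != caller})

defined, calls, loop_calls = extract_facts(SOURCE)
recursive_fns = directly_recursive(defined, calls)
looping_fns = loop_callers(defined, loop_calls)
print(recursive_fns, looping_fns)
```

Both queries are answered deterministically from the extracted facts, and the answer can be checked against those facts. That is the kind of verifiable structural reasoning the abstract contrasts with lexical matching. This sketch attributes calls in nested functions to the enclosing function as well and ignores method calls; a real system would need a richer fact schema.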
