SweRank: Software Issue Localization with Code Ranking

Revanth Gangi Reddy; Tarun Suresh; JaeHyeok Doo; Ye Liu; Xuan Phi Nguyen; Yingbo Zhou; Semih Yavuz; Caiming Xiong; Heng Ji; Shafiq Joty

arXiv:2505.07849·cs.SE·April 23, 2026


TL;DR

SweRank is a retrieve-and-rerank framework for software issue localization. Trained on SweLoc, a large-scale dataset, it outperforms both prior ranking models and costly LLM-based approaches.

Contribution

The paper introduces SweRank, an efficient retrieve-and-rerank framework for issue localization, and SweLoc, a large-scale dataset for training and evaluating such models. Together they achieve state-of-the-art results.

Findings

01. SweRank outperforms prior ranking models and LLM-based systems on benchmark datasets.

02. SweLoc enables effective training of issue localization models with real-world data.

03. SweRank improves the utility of existing retriever and reranker models for issue localization.

Abstract

Software issue localization, the task of identifying the precise code locations (files, classes, or functions) relevant to a natural language issue description (e.g., bug report, feature request), is a critical yet time-consuming aspect of software development. While recent LLM-based agentic approaches demonstrate promise, they often incur significant latency and cost due to complex multi-step reasoning and reliance on closed-source LLMs. Alternatively, traditional code ranking models, typically optimized for query-to-code or code-to-code retrieval, struggle with the verbose and failure-descriptive nature of issue localization queries. To bridge this gap, we introduce SweRank, an efficient and effective retrieve-and-rerank framework for software issue localization. To facilitate training, we construct SweLoc, a large-scale dataset curated from public GitHub repositories, featuring…
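
To make the retrieve-and-rerank idea in the abstract concrete, here is a minimal Python sketch of a two-stage pipeline: a cheap first stage scores every candidate function against the issue text and keeps the top k, and a second stage reorders only that shortlist. The scoring functions below (bag-of-words cosine similarity and token coverage) are toy stand-ins chosen for illustration, not the learned models the paper trains, and every name here (`tokenize`, `retrieve`, `rerank`, `localize`, the example corpus) is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; camelCase and snake_case identifiers are split."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(issue, corpus, k):
    """Stage 1 (toy retriever): score all candidates, keep the top k names."""
    query = Counter(tokenize(issue))
    scored = [(cosine(query, Counter(tokenize(code))), name)
              for name, code in corpus.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by name
    return [name for _, name in scored[:k]]


def rerank(issue, corpus, candidates):
    """Stage 2 (toy reranker): reorder the shortlist by issue-token coverage."""
    query = set(tokenize(issue))

    def coverage(name):
        if not query:
            return 0.0
        return len(query & set(tokenize(corpus[name]))) / len(query)

    return sorted(candidates, key=lambda name: (-coverage(name), name))


def localize(issue, corpus, k=2):
    """Retrieve a shortlist, then rerank it."""
    return rerank(issue, corpus, retrieve(issue, corpus, k))


# Hypothetical candidate functions, keyed by a made-up qualified name.
corpus = {
    "utils.parse_date": "def parse_date(value): return datetime.strptime(value, '%Y-%m-%d')",
    "utils.format_name": "def format_name(first, last): return first + ' ' + last",
    "io.read_config": "def read_config(path): return json.load(open(path))",
}
issue = "parse_date crashes with ValueError when the date value has a timezone suffix"

ranked = localize(issue, corpus, k=2)
print(ranked)
```

The point of the split is cost: the first stage must be cheap enough to run over every function in a repository, while the second stage can afford a more expensive model because it only sees the k shortlisted candidates.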


Videos

SWERank: Software Issue Localization with Code Ranking · SlidesLive