XSearch: Explainable Code Search via Concept-to-Code Alignment

Yiming Liu; Ruofan Liu; Yun Lin; Zicong Zhang; Weiyu Kong; Pengnian Qi; Xiao Cheng; Weinan Zhang; Qianxiang Wang; Linpeng Huang

arXiv:2605.16046·cs.SE·May 18, 2026

TL;DR

XSearch introduces an explainable code search framework that aligns query concepts with code statements, improving out-of-distribution performance and providing inherent explanations.

Contribution

XSearch reformulates code search as a concept alignment problem, which makes results explainable and generalizes better than traditional embedding-based methods.
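To make the idea of concept alignment concrete, here is a minimal toy sketch. It is not XSearch's actual method: this summary does not say how concepts are extracted or how they are matched to statements, so the `tokens`, `align`, and `coverage` helpers and the token-overlap matcher are all hypothetical placeholders. The sketch only shows the shape of the output: each query concept is either mapped to a code statement, which serves as the explanation, or reported as unmatched.

```python
import re
from typing import Optional


def tokens(text: str) -> set[str]:
    """Lowercase alphabetic tokens; a crude placeholder for real concept matching."""
    return set(re.findall(r"[a-z]+", text.lower()))


def align(concepts: list[str], statements: list[str]) -> dict[str, Optional[str]]:
    """Map each query concept to the statement sharing the most tokens, or None.

    Hypothetical matcher for illustration only, not the paper's alignment model.
    """
    alignment: dict[str, Optional[str]] = {}
    for concept in concepts:
        want = tokens(concept)
        best, best_overlap = None, 0
        for stmt in statements:
            overlap = len(want & tokens(stmt))
            if overlap > best_overlap:  # ties keep the earliest statement
                best, best_overlap = stmt, overlap
        alignment[concept] = best
    return alignment


def coverage(alignment: dict[str, Optional[str]]) -> float:
    """Fraction of query concepts that found a supporting statement."""
    if not alignment:
        return 0.0
    return sum(v is not None for v in alignment.values()) / len(alignment)


if __name__ == "__main__":
    concepts = ["open file", "sort lines", "remove duplicates"]
    statements = [
        "lines = open(path).read().splitlines()",
        "lines.sort()",
        "return lines",
    ]
    result = align(concepts, statements)
    for concept, stmt in result.items():
        print(f"{concept!r} -> {stmt!r}")
    print("coverage:", round(coverage(result), 2))
```

In this toy run the concept "remove duplicates" has no supporting statement, so a user can see which requirement the candidate snippet misses. A single global similarity score does not expose that.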

Findings

01. XSearch improves out-of-distribution benchmark performance from 0.02 to 0.33.

02. Concept-alignment explanations help users evaluate results faster and more accurately.

03. XSearch outperforms state-of-the-art retrievers with up to 7B parameters.

Abstract

Semantic code search has been widely adopted in both academia and industry. These approaches embed natural-language queries and code snippets into a shared embedding space and retrieve results based on vector similarity. Despite strong performance on benchmark datasets, they often suffer from poor explainability and generalization. Retrieved code may appear semantically similar yet miss critical functional requirements of the query, while providing no explanation of why the result was retrieved. Moreover, such failures become more severe under distribution shift, where models struggle to generalize to unseen benchmarks. In this work, we propose XSearch, an intrinsically explainable code search framework. Our key insight is that by relying on global embedding similarity, existing retrievers inherently take an inductive view. They learn statistical patterns rather than truly understanding…
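The embedding-based baseline that the abstract contrasts against can be sketched as follows. This is an illustration under stated assumptions, not any specific retriever: `embed` is a hypothetical stand-in that uses bag-of-words counts where a real system would use a learned encoder, and `search` ranks snippets by cosine similarity to the query vector.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a learned encoder: bag-of-words token counts (assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, snippets: list[str], k: int = 1) -> list[tuple[float, str]]:
    """Return the top-k (score, snippet) pairs by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(s)), s) for s in snippets]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    corpus = [
        "def read_file(path): return open(path).read()",
        "def add(a, b): return a + b",
    ]
    for score, snippet in search("read file lines", corpus, k=2):
        print(f"{score:.2f}  {snippet}")
```

The result is a single score per snippet. Nothing in it indicates which parts of the query the code satisfies, which is the explainability gap the abstract describes.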
