CoIR: A Comprehensive Benchmark for Code Information Retrieval Models

Xiangyang Li; Kuicai Dong; Yi Quan Lee; Wei Xia; Hao Zhang; Xinyi Dai; Yasheng Wang; Ruiming Tang

arXiv:2407.02883 · cs.IR · June 9, 2025


TL;DR

COIR is a benchmark for code information retrieval comprising ten curated datasets that span eight retrieval tasks across seven domains. An evaluation of nine widely used retrieval models on it reveals significant difficulties with code retrieval.

Contribution

The paper presents COIR, a new benchmark for code retrieval that includes curated datasets, an evaluation framework, and an analysis of how existing models perform.

Findings

01. State-of-the-art models face significant challenges in code retrieval tasks.

02. COIR's diverse datasets reveal gaps in current model capabilities.

03. The benchmark facilitates cross-domain and cross-task evaluation of code retrieval systems.

Abstract

Despite the substantial success of Information Retrieval (IR) in various NLP tasks, most IR systems predominantly handle queries and corpora in natural language, neglecting the domain of code retrieval. Code retrieval is critically important yet remains under-explored, with existing methods and benchmarks inadequately representing the diversity of code in various domains and tasks. Addressing this gap, we present COIR (Code Information Retrieval Benchmark), a robust and comprehensive benchmark specifically designed to assess code retrieval capabilities. COIR comprises ten meticulously curated code datasets, spanning eight distinctive retrieval tasks across seven diverse domains. We first discuss the construction of COIR and its diverse dataset composition. Further, we evaluate nine widely used retrieval models using COIR, uncovering significant difficulties in performing code retrieval…
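To make concrete what "evaluating a retrieval model" on a benchmark of this kind involves, here is a minimal sketch of a ranking metric. It assumes nDCG@10, a metric commonly used in retrieval benchmarks; the abstract above does not name the metric, and the function names, query ids, and document ids are all hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance grades."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_ids, qrels, k=10):
    """nDCG@k for a single query.

    ranked_ids: document ids in the order the model returned them.
    qrels: dict mapping each relevant document id to its relevance grade (> 0).
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_ids]
    ideal_dcg = dcg_at_k(sorted(qrels.values(), reverse=True), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(run, all_qrels, k=10):
    """Average nDCG@k over all judged queries; unanswered queries score 0."""
    scores = [ndcg_at_k(run.get(qid, []), qrels, k) for qid, qrels in all_qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (assumed, not taken from the benchmark): two queries, one relevant snippet each.
all_qrels = {"q1": {"d1": 1}, "q2": {"d7": 1}}
run = {"q1": ["d1", "d2"], "q2": ["d3", "d7"]}
score = mean_ndcg(run, all_qrels)
```

In this toy run the first query places its relevant snippet at rank 1 and the second at rank 2, so the mean falls below 1. Averaging such per-query scores over each dataset is one plausible way to compare models across tasks and domains.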


Code & Models

Repositories

coir-team/coir (PyTorch, official)

Videos

CoIR: A Comprehensive Benchmark for Code Information Retrieval Models · underline

Taxonomy

Topics: Advanced Computational Techniques and Applications