MIRA: An LLM-Assisted Benchmark for Multi-Category Integrated Retrieval

Mehmet Deniz Türkmen; Suchana Datta; Dwaipayan Roy; Daniel Hienert; Philipp Mayr; Derek Greene

arXiv:2605.11254 · cs.IR · May 13, 2026

TL;DR

MIRA is a new benchmark for multi-category integrated retrieval. It combines real user queries with LLM-generated topic descriptions and relevance assessments to evaluate retrieval across diverse scholarly item categories in a unified framework.

Contribution

The paper introduces a large-scale, realistic IR benchmark covering four scholarly categories, using LLMs for relevance assessment to reduce costs.

Findings

01 Built on real user queries for realism
02 Covers four distinct scholarly categories
03 Uses LLMs for topic description and relevance assessment

Abstract

Users increasingly expect modern search systems to offer a unified interface that seamlessly retrieves information from diverse data sources and formats. However, current information retrieval (IR) evaluation benchmarks have not kept pace with this development, primarily because few test collections represent the diversity of contemporary search domains. We address this gap with MIRA, a novel benchmark based on a large-scale social science search platform. MIRA is designed for category-aware ranking across four heterogeneous categories (Publications, Research Data, Variables, and Instruments & Tools) within a single, unified evaluation framework. The proposed collection is distinctive in several ways: (1) it is built upon real user queries, providing a more realistic basis for evaluation; (2) it covers scholarly items from four distinct categories, enabling…
