DSEBench: A Test Collection for Explainable Dataset Search with Examples

Qing Shi; Jing He; Qiaosheng Chen; Gong Cheng

arXiv:2510.17228 · cs.IR · December 11, 2025

TL;DR

This paper introduces DSEBench, a new test collection for evaluating explainable dataset search in which a query combines keywords with target datasets. It provides dataset-level and field-level annotations along with baseline evaluations.

Contribution

It presents the first test collection supporting both dataset-level and field-level evaluation of explainable dataset search, together with annotations and baseline methods.

Findings

01. DSEBench enables comprehensive evaluation of dataset search methods.

02. Large language models can generate useful training annotations.

03. Baseline experiments demonstrate the effectiveness of various retrieval and explanation techniques.

Abstract

Dataset search is a well-established task in the Semantic Web and information retrieval research. Current approaches retrieve datasets either based on keyword queries or by identifying datasets similar to a given target dataset. These paradigms fail when the information need involves both keywords and target datasets. To address this gap, we investigate a generalized task, Dataset Search with Examples (DSE), and extend it to Explainable DSE (ExDSE), which further requires identifying relevant fields of the retrieved datasets. We construct DSEBench, the first test collection that provides high-quality dataset-level and field-level annotations to support the evaluation of DSE and ExDSE, respectively. In addition, we employ a large language model to generate extensive annotations for training purposes. We establish comprehensive baselines on DSEBench by adapting and evaluating a variety of…
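
To make the task definition concrete, the following is a minimal sketch of a DSE-style query, not the authors' method or any DSEBench baseline. It assumes a toy `Dataset` record with hypothetical text fields, scores candidates by a weighted mix of keyword overlap and overlap with the target dataset (plain Jaccard similarity, chosen only for illustration), and returns the candidate fields that share tokens with the query as a crude stand-in for ExDSE's field-level explanation. The names `dse_score`, `relevant_fields`, `search`, and the weight `alpha` are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Toy dataset record; the field names used below are hypothetical."""
    fields: dict  # field name -> text, e.g. {"title": "...", "description": "..."}


def tokens(text):
    # Lowercased whitespace tokenization; a real system would do far more.
    return set(text.lower().split())


def jaccard(a, b):
    # Jaccard similarity of two token sets; 0.0 when both are empty.
    return len(a & b) / len(a | b) if a | b else 0.0


def dse_score(keywords, target, candidate, alpha=0.5):
    # Assumed scoring: mix keyword relevance with similarity to the target dataset.
    cand = tokens(" ".join(candidate.fields.values()))
    targ = tokens(" ".join(target.fields.values()))
    return alpha * jaccard(tokens(keywords), cand) + (1 - alpha) * jaccard(targ, cand)


def relevant_fields(keywords, target, candidate):
    # Crude explanation: candidate fields sharing a token with the keywords or target.
    evidence = tokens(keywords) | tokens(" ".join(target.fields.values()))
    return [name for name, text in candidate.fields.items() if tokens(text) & evidence]


def search(keywords, target, corpus, k=2):
    # Rank the corpus by the combined score and attach the explaining fields.
    ranked = sorted(corpus, key=lambda d: dse_score(keywords, target, d), reverse=True)
    return [(d, relevant_fields(keywords, target, d)) for d in ranked[:k]]
```

A call such as `search("air pollution", target, corpus)` returns pairs of a retrieved dataset and the names of its fields judged relevant, which mirrors the two annotation levels (dataset and field) that the abstract describes.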

Taxonomy

Topics: Information Retrieval and Search Behavior · Topic Modeling · Semantic Web and Ontologies