Validating Search Query Simulations: A Taxonomy of Measures
Andreas Konstantin Kruff, Nolwenn Bernard, Philipp Schaer

TL;DR
This paper reviews and categorizes measures for validating simulated user queries in information retrieval, empirically tests their relationships across datasets, and offers practical recommendations and tools for researchers.
Contribution
It introduces a comprehensive taxonomy of validation measures, empirically verifies their relationships, and provides guidelines and a library to improve simulation validation practices.
Findings
The taxonomy clarifies measure relationships across search scenarios.
Empirical analysis reveals that the effectiveness of a measure varies by context.
Recommendations help select appropriate validation measures.
Abstract
Assessing the validity of user simulators when used for the evaluation of information retrieval systems remains an open question, constraining their effective use and the reliability of simulation-based results. To address this issue, we conduct a comprehensive literature review with a particular focus on methods for the validation of simulated user queries with regard to real queries. Based on the review, we develop a taxonomy that structures the current landscape of available measures. We empirically corroborate the taxonomy by analyzing the relationships between the different measures applied to four different datasets representing diverse search scenarios. Finally, we provide concrete recommendations on which measures or combinations of measures should be considered when validating user simulation in different contexts. Furthermore, we release a dedicated library with the most…
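The idea of analyzing relationships between validation measures can be illustrated with a minimal sketch. It assumes two simple query-level measures, term-set Jaccard overlap and a length ratio, and compares them with a Spearman rank correlation over simulated/real query pairs. The measures, the function names, and the toy query pairs are all assumptions chosen for illustration. They are not taken from the paper's taxonomy or from its released library.

```python
from statistics import mean

def jaccard(a: str, b: str) -> float:
    """Term-set overlap between two queries (illustrative measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def length_similarity(a: str, b: str) -> float:
    """Ratio of shorter to longer query length in terms (illustrative measure)."""
    la, lb = len(a.split()), len(b.split())
    if max(la, lb) == 0:
        return 1.0
    return min(la, lb) / max(la, lb)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

if __name__ == "__main__":
    # Hypothetical (simulated, real) query pairs, invented for this sketch.
    pairs = [
        ("cheap flights berlin", "cheap flights to berlin"),
        ("python sort list", "sort list python"),
        ("weather", "weather forecast tomorrow hamburg"),
        ("neural ranking models survey", "bm25 parameters"),
    ]
    overlap = [jaccard(s, r) for s, r in pairs]
    length = [length_similarity(s, r) for s, r in pairs]
    print("Jaccard overlap:  ", overlap)
    print("Length similarity:", length)
    print("Spearman correlation:", spearman(overlap, length))
```

A high rank correlation between two measures on a given dataset would suggest they capture similar aspects of query similarity there, while a low one would suggest they complement each other. A sample this small says nothing reliable; it only demonstrates the mechanics.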
Taxonomy
Topics: Information Retrieval and Search Behavior · Data Visualization and Analytics · Libraries and Information Services
