Validating Simulations of User Query Variants
Timo Breuer, Norbert Fuhr, Philipp Schaer

TL;DR
This paper validates simulated user query variants against real user queries collected for TREC topics, and proposes a simple method that reproduces real queries better than established approaches.
Contribution
It introduces a simple yet effective method for simulating user query variants that reproduces real queries better than established methods.
Findings
Simulated queries closely match real queries in retrieval effectiveness.
The new method improves query term similarity and better reproduces the distribution of topic scores.
Simulations are useful for system evaluation when real user data is unavailable.
Abstract
System-oriented IR evaluations are limited to rather abstract understandings of real user behavior. As a solution, simulating user interactions provides a cost-efficient way to support system-oriented experiments with more realistic directives when no interaction logs are available. While there are several user models for simulated clicks or result list interactions, very few attempts have been made towards query simulations, and it has not been investigated if these can reproduce properties of real queries. In this work, we validate simulated user query variants with the help of TREC test collections in reference to real user queries that were made for the corresponding topics. Besides, we introduce a simple yet effective method that gives better reproductions of real queries than the established methods. Our evaluation framework validates the simulations regarding the retrieval…
Taxonomy
Topics: Information Retrieval and Search Behavior · Recommender Systems and Techniques · Expert finding and Q&A systems
