TL;DR
This paper explores how temporal changes in document collections impact the evaluation of information retrieval systems, proposing a model to assess effectiveness over time and demonstrating significant effects of data changes on retrieval performance.
Contribution
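It introduces a conceptual model that generalizes Cranfield-type experiments to the temporal context by classifying changes in their essential components as create, update, or delete operations, and it evaluates how different types of data changes affect retrieval effectiveness.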
Findings
Retrieval effectiveness varies significantly with different data change scenarios.
Both average effectiveness and relative system performance are affected by the extent and type of data changes.
Proposed measures effectively describe changes in retrieval results over time.
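To make the distinction between average and relative system performance concrete, the following is a minimal sketch and not the paper's proposed measures, which this summary does not spell out. It assumes each system has one effectiveness score per snapshot of the collection. The system names and score values are hypothetical toy inputs, and Kendall's tau is swapped in here as a familiar way to compare two system orderings.

```python
from itertools import combinations
from statistics import mean


def mean_delta(scores_t0, scores_t1):
    """Change in the mean per-topic score of one system between two snapshots."""
    return mean(scores_t1) - mean(scores_t0)


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two system orderings (ties count as neither)."""
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy scores, invented for illustration only (not results).
effectiveness_t0 = {"bm25": 0.30, "dense": 0.35, "hybrid": 0.40}
effectiveness_t1 = {"bm25": 0.28, "dense": 0.25, "hybrid": 0.38}

# Average view: how much one system's mean per-topic score moved.
delta = mean_delta([0.2, 0.4], [0.1, 0.3])

# Relative view: did the ordering of systems change between snapshots?
tau = kendall_tau(effectiveness_t0, effectiveness_t1)
```

A tau of 1 would mean the system ordering is identical at both points in time. Values below 1 indicate that some pairs of systems swapped places, even if the average scores moved only slightly.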
Abstract
Information retrieval systems have been evaluated using the Cranfield paradigm for many years. This paradigm allows a systematic, fair, and reproducible evaluation of different retrieval methods in fixed experimental environments. However, real-world retrieval systems must cope with dynamic environments and temporal changes that affect the document collection, topical trends, and the individual user's perception of what is considered relevant. Yet, the temporal dimension in IR evaluations is still understudied. To this end, this work investigates how the temporal generalizability of effectiveness evaluations can be assessed. As a conceptual model, we generalize Cranfield-type experiments to the temporal context by classifying the change in the essential components according to the create, update, and delete operations of persistent storage known from CRUD. From the different types of…
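The create, update, and delete classification mentioned in the abstract can be illustrated for the document collection with a minimal sketch. It assumes two snapshots of a collection are available as mappings from document ID to text. The function name, the snapshot layout, and the example documents are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def classify_changes(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, Set[str]]:
    """Label document IDs by how they changed between two snapshots."""
    old_ids, new_ids = set(old), set(new)
    shared = old_ids & new_ids
    return {
        "created": new_ids - old_ids,                         # only in the newer snapshot
        "deleted": old_ids - new_ids,                         # only in the older snapshot
        "updated": {d for d in shared if old[d] != new[d]},   # same ID, different text
        "unchanged": {d for d in shared if old[d] == new[d]},
    }


# Hypothetical toy snapshots of a document collection at two points in time.
snapshot_t0 = {"d1": "cranfield experiments", "d2": "web search", "d3": "old news"}
snapshot_t1 = {"d1": "cranfield experiments", "d2": "web search engines", "d4": "fresh news"}

changes = classify_changes(snapshot_t0, snapshot_t1)
```

The same comparison could in principle be applied to the other components of a test collection, such as topics or relevance judgments, as long as each item has a stable identifier across snapshots.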
