Evaluating Temporal Persistence Using Replicability Measures
Jüri Keller, Timo Breuer, Philipp Schaer

TL;DR
This paper investigates the temporal persistence of IR systems by submitting five retrieval systems to the LongEval longitudinal evaluation tasks and casting the evaluation as a replicability study to understand how system performance changes over time.
Contribution
It introduces a method to assess IR system persistence through replicability measures and reports on the longitudinal evaluation of five advanced retrieval systems.
Findings
High potential in using replicability as an evaluation method
Quantified persistence of different IR systems over time
Demonstrated the importance of longitudinal evaluation in IR
Abstract
In real-world Information Retrieval (IR) experiments, the Evaluation Environment (EE) is exposed to constant change. Documents are added, removed, or updated, and the information needs and search behavior of users evolve. Simultaneously, IR systems are expected to retain a consistent quality. The LongEval Lab seeks to investigate the longitudinal persistence of IR systems, and in this work, we describe our participation. We submitted runs of five advanced retrieval systems, namely a Reciprocal Rank Fusion (RRF) approach, ColBERT, monoT5, Doc2Query, and E5, to both sub-tasks. Further, we cast the longitudinal evaluation as a replicability study to better understand the temporal change observed. As a result, we quantify the persistence of the submitted runs and see great potential in this evaluation method.
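For readers unfamiliar with Reciprocal Rank Fusion, the following is a minimal sketch of the general technique, not the authors' implementation. The abstract does not say which runs were fused or which parameters were used, so the input lists (`bm25_run`, `dense_run`), the function name, and the constant `k=60` (a commonly used default for RRF) are assumptions made purely for illustration. Each document receives the sum of `1 / (k + rank)` over all rankings in which it appears.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant; 60 is a common default (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top document contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical input runs, for illustration only.
bm25_run = ["d1", "d2", "d3"]
dense_run = ["d2", "d3", "d4"]

fused = reciprocal_rank_fusion([bm25_run, dense_run])
for doc_id, score in fused:
    print(f"{doc_id}\t{score:.5f}")
```

Documents that appear in both lists (`d2`, `d3`) rise above those that appear in only one, which is the behavior that makes RRF a simple baseline for combining heterogeneous rankers.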
Taxonomy
Topics: Information Retrieval and Search Behavior · Advanced Text Analysis Techniques · Complex Network Analysis Techniques
