Online Information Retrieval Evaluation using the STELLA Framework

Timo Breuer; Narges Tavakolpoursaleh; Johann Schaible; Daniel Hienert; Philipp Schaer; Leyla Jael Castro

arXiv:2210.13202·cs.IR·October 25, 2022

TL;DR

The paper introduces the STELLA framework, an infrastructure that enables large-scale A/B testing of academic search systems by integrating user interactions and log analysis in real-world settings.

Contribution

The paper presents an evaluation infrastructure that lets experimental systems run alongside production academic search systems, so that IR systems can be evaluated continuously on real user interactions.

Findings

01 Enables large-scale A/B experiments with real users

02 Integrates user interactions and log analysis for IR evaluation

03 Supports continuous, real-world system assessment

Abstract

Involving users in the early phases of software development has become a common strategy, as it enables developers to consider user needs from the beginning. Once a system is in production, new opportunities to observe, evaluate, and learn from users emerge as more information becomes available. Gathering information from users to continuously evaluate their behavior is common practice for commercial software, while the Cranfield paradigm remains the preferred option for Information Retrieval (IR) and recommendation systems in the academic world. Here we introduce the Infrastructures for Living Labs STELLA project, which aims to create an evaluation infrastructure that allows experimental systems to run alongside production web-based academic search systems with real users. STELLA combines user interactions and log file analysis to enable large-scale A/B experiments for academic search.
