Evaluation of recommender systems in streaming environments
João Vinagre, Alípio Mário Jorge, João Gama

TL;DR
This paper introduces a prequential evaluation protocol for recommender systems that tracks algorithm performance over time in streaming data environments. It addresses a limitation of conventional offline evaluation, which assumes finite datasets and stationary models.
Contribution
The paper proposes a prequential evaluation protocol for recommender systems that suits streaming data and also applies in stationary settings. It supports continuous monitoring of accuracy and reliable comparison of algorithms over time.
Findings
Prequential evaluation tracks how an algorithm's accuracy evolves over time in streaming environments.
The protocol supports significance tests computed over a sliding window, so algorithms can be compared reliably.
It detects phenomena that go unnoticed in traditional offline evaluation.
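To make the first finding concrete, here is a minimal sketch of a prequential (test-then-learn) loop in Python. It is an illustration, not the authors' implementation: the `PopularityRecommender` class, the hit-based score, and the window size are all assumptions introduced for the example. Each incoming event is first used to test the current model and only afterwards used to update it.

```python
from collections import Counter, deque


class PopularityRecommender:
    """Hypothetical toy model: recommends the most frequently seen items."""

    def __init__(self):
        self.counts = Counter()
        self.seen = {}  # user -> set of items the user already interacted with

    def recommend(self, user, n):
        seen = self.seen.get(user, set())
        ranked = [item for item, _ in self.counts.most_common() if item not in seen]
        return ranked[:n]

    def update(self, user, item):
        self.counts[item] += 1
        self.seen.setdefault(user, set()).add(item)


def prequential_hits(stream, model, n=1):
    """Score each (user, item) event before the model learns from it."""
    hits = []
    for user, item in stream:
        recs = model.recommend(user, n)        # 1. test with the current model
        hits.append(1 if item in recs else 0)  # 2. record a hit or a miss
        model.update(user, item)               # 3. learn from the event
    return hits


def windowed_mean(values, window):
    """Mean of the most recent `window` values at each step."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out


# Assumed toy stream of (user, item) events, in arrival order.
stream = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u1", "b"), ("u2", "b")]
hits = prequential_hits(stream, PopularityRecommender(), n=1)
accuracy_over_time = windowed_mean(hits, window=3)
```

Plotting `accuracy_over_time` against the event index gives a curve of accuracy evolution, rather than the single aggregate number an offline hold-out split would produce.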
Abstract
Evaluation of recommender systems is typically done with finite datasets. This means that conventional evaluation methodologies are only applicable in offline experiments, where data and models are stationary. However, in real world systems, user feedback is continuously generated, at unpredictable rates. Given this setting, one important issue is how to evaluate algorithms in such a streaming data environment. In this paper we propose a prequential evaluation protocol for recommender systems, suitable for streaming data environments, but also applicable in stationary settings. Using this protocol we are able to monitor the evolution of algorithms' accuracy over time. Furthermore, we are able to perform reliable comparative assessments of algorithms by computing significance tests over a sliding window. We argue that besides being suitable for streaming data, prequential evaluation…
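The sliding-window comparison mentioned in the abstract can be sketched as follows. The text above does not say which significance test is used, so this example assumes McNemar's test, a common choice for paired binary outcomes. The function names, the window size, and the toy hit sequences are hypothetical.

```python
from collections import deque


def mcnemar_statistic(hits_a, hits_b):
    """McNemar statistic from paired 0/1 outcomes of two algorithms."""
    n10 = sum(1 for a, b in zip(hits_a, hits_b) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(hits_a, hits_b) if a == 0 and b == 1)
    if n10 + n01 == 0:
        return 0.0  # no discordant pairs: no evidence of a difference
    return (n10 - n01) ** 2 / (n10 + n01)


def sliding_mcnemar(hits_a, hits_b, window):
    """Recompute the statistic over the most recent `window` paired outcomes."""
    win = deque(maxlen=window)
    stats = []
    for pair in zip(hits_a, hits_b):
        win.append(pair)
        a, b = zip(*win)
        stats.append(mcnemar_statistic(a, b))
    return stats


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRIT_05 = 3.841

# Assumed per-event hit indicators for two algorithms on the same stream.
hits_a = [1, 1, 1, 1, 1, 1, 0, 1, 1, 1]
hits_b = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1]

stats = sliding_mcnemar(hits_a, hits_b, window=10)
significant = stats[-1] > CHI2_CRIT_05
```

Because the window moves with the stream, the outcome of the comparison can change over time. The chi-square approximation is rough when a window holds few discordant pairs, so small windows should be read with care.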
