Candidate Set Sampling for Evaluating Top-N Recommendation
Ngozi Ihemelandu, Michael D. Ekstrand

TL;DR
This paper investigates how candidate set selection strategies affect the offline evaluation of top-N recommender systems, focusing on their interaction with popularity bias, and uses simulation to compare metric estimates from sampled candidate sets against the true metric values.
Contribution
It analyzes how candidate set selection strategies interact with popularity bias and evaluates whether sampled candidate sets give less biased metric estimates.
Findings
Sampling candidate sets can reduce bias in metric estimates.
Candidate set size and composition significantly affect evaluation accuracy.
Simulation shows sampled sets approximate true metrics better than full candidate sets.
Abstract
The strategy for selecting candidate sets -- the set of items that the recommendation system is expected to rank for each user -- is an important decision in carrying out an offline top-N recommender system evaluation. The set of candidates is composed of the union of the user's test items and an arbitrary number of non-relevant items that we refer to as decoys. Previous studies have aimed to understand the effect of different candidate set sizes and selection strategies on evaluation. In this paper, we extend this knowledge by studying the specific interaction of candidate set selection strategies with popularity bias, and use simulation to assess whether sampled candidate sets result in metric estimates that are less biased with respect to the true metric values under complete data that is typically unavailable in ordinary experiments.
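The candidate set construction described in the abstract (test items plus sampled decoys) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_candidate_set`, its parameters, the choice to exclude the user's training items from the decoy pool, and the optional popularity weighting are all assumptions made for illustration.

```python
import random


def build_candidate_set(test_items, train_items, all_items, n_decoys,
                        weights=None, seed=0):
    """Return the user's test items plus `n_decoys` sampled decoys.

    Hypothetical helper: decoys are drawn without replacement from items
    the user has no recorded interaction with (an assumption). If `weights`
    maps every pool item to a positive number (e.g. a popularity count),
    decoys are drawn proportionally to it; otherwise uniformly.
    """
    rng = random.Random(seed)
    excluded = set(test_items) | set(train_items)
    pool = sorted(i for i in all_items if i not in excluded)
    n = min(n_decoys, len(pool))

    if weights is None:
        # Uniform sampling without replacement.
        decoys = rng.sample(pool, n)
    else:
        # Weighted sampling without replacement: draw one item at a time
        # and remove it from the pool before the next draw.
        remaining = list(pool)
        decoys = []
        for _ in range(n):
            w = [weights[i] for i in remaining]
            pick = rng.choices(remaining, weights=w, k=1)[0]
            decoys.append(pick)
            remaining.remove(pick)

    return sorted(test_items) + decoys


# Toy data (made up for illustration): 100 items, one user.
all_items = list(range(100))
train = {1, 2, 3}
test = {4, 5}
popularity = {i: 100 - i for i in all_items}  # item 0 is the most popular

uniform_cands = build_candidate_set(test, train, all_items, n_decoys=10)
popular_cands = build_candidate_set(test, train, all_items, n_decoys=10,
                                    weights=popularity)
```

A recommender would then rank only the items in `uniform_cands` or `popular_cands` for this user, and the evaluation metric would be computed over that ranking rather than over the full catalog.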
