On Sampling Top-K Recommendation Evaluation

Dong Li; Ruoming Jin; Jing Gao; Zhi Liu

arXiv:2106.10621·cs.IR·June 22, 2021

TL;DR

This paper investigates the relationship between sampling-based top-$k$ and global top-$K$ recommendation metrics, showing that the sampling Hit-Ratio approximates its global counterpart and identifies the same winning algorithms.

Contribution

It provides a theoretical and empirical analysis showing that the sampling top-$k$ Hit-Ratio accurately reflects the global top-$K$ Hit-Ratio once $k$ is mapped to a corresponding $K$, and that it picks the same winners among recommendation algorithms.

Findings

01 Sampling top-$k$ Hit-Ratio closely approximates global top-$K$ Hit-Ratio.

02 Sampling metrics can reliably predict the best-performing recommendation algorithms.

03 Theoretical and experimental validation supports the use of sampling metrics in evaluation.

Abstract

Recently, Rendle has warned that the use of sampling-based top-$k$ metrics might not suffice. This throws a number of recent studies on deep learning-based recommendation algorithms, and classic non-deep-learning algorithms using such a metric, into jeopardy. In this work, we thoroughly investigate the relationship between the sampling and global top-$K$ Hit-Ratio (HR, or Recall), originally proposed by Koren [2] and extensively used by others. By formulating the problem of aligning sampling top-$k$ ($SHR@k$) and global top-$K$ ($HR@K$) Hit-Ratios through a mapping function $f$, so that $SHR@k \approx HR@f(k)$, we demonstrate both theoretically and experimentally that the sampling top-$k$ Hit-Ratio provides an accurate approximation of its global (exact) counterpart, and can consistently predict the correct winners (the same as indicated by their corresponding global Hit-Ratios).
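To make the two quantities in the abstract concrete, here is a minimal Python sketch of $HR@K$ and $SHR@k$ computed from each user's global rank of the held-out item. Everything in it is an illustrative assumption rather than the paper's code: the function names, the choice to draw negatives uniformly without replacement, and in particular `linear_map`, which is a simple expectation-based scaling used here as a stand-in for $f$ and is not claimed to be the mapping the authors derive.

```python
import random


def global_hit_ratio(ranks, K):
    """HR@K: fraction of users whose target item has global rank <= K."""
    return sum(r <= K for r in ranks) / len(ranks)


def sampled_rank(global_rank, num_items, num_sampled, rng):
    """Rank of the target among itself plus (num_sampled - 1) sampled negatives.

    The other num_items - 1 items are indexed 1..num_items-1 in score order,
    so an index below global_rank means that item outranks the target.
    Negatives are drawn uniformly without replacement (an assumption).
    """
    negatives = rng.sample(range(1, num_items), num_sampled - 1)
    above = sum(pos < global_rank for pos in negatives)
    return above + 1


def sampling_hit_ratio(ranks, k, num_items, num_sampled, rng):
    """SHR@k: fraction of users whose sampled rank is <= k."""
    hits = sum(
        sampled_rank(r, num_items, num_sampled, rng) <= k for r in ranks
    )
    return hits / len(ranks)


def linear_map(k, num_items, num_sampled):
    """Hypothetical stand-in for f: invert E[sampled rank] for a given k.

    E[r] = 1 + (n - 1) * (R - 1) / (N - 1), solved for R at r = k.
    """
    return round(1 + (k - 1) * (num_items - 1) / (num_sampled - 1))


if __name__ == "__main__":
    rng = random.Random(0)
    N, n = 10_000, 100  # catalogue size and sample size (illustrative values)
    # Synthetic global ranks, for demonstration only.
    ranks = [rng.randint(1, N) for _ in range(5_000)]
    for k in (1, 5, 10):
        K = linear_map(k, N, n)
        shr = sampling_hit_ratio(ranks, k, N, n, rng)
        hr = global_hit_ratio(ranks, K)
        print(f"k={k:>2}  f(k)={K:>4}  SHR@k={shr:.4f}  HR@f(k)={hr:.4f}")
```

Running the script prints $SHR@k$ next to $HR@f(k)$ for a few values of $k$ on synthetic ranks. How closely the two columns agree depends on the rank distribution and on the mapping chosen, which is the alignment question the paper studies.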
