On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-$n$ Recommendation
Olivier Jeunen, Ivan Potapov, Aleksei Ustimenko

TL;DR
This paper critically examines the use of (Normalised) Discounted Cumulative Gain (nDCG) as an offline evaluation metric for top-$n$ recommendation, identifying the conditions under which it reliably approximates the outcome of an online experiment and where it falls short.
Contribution
It formally analyzes the assumptions behind DCG as an unbiased estimator of online reward and demonstrates that normalizing DCG can lead to inconsistent rankings of methods.
Findings
Unnormalized DCG correlates well with online reward under certain assumptions.
Normalizing DCG can invert the ranking of methods, reducing its reliability.
Empirical analysis shows unnormalized DCG's practical utility in offline evaluation.
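The ranking inversion described in the findings can be illustrated with a small sketch. It assumes binary relevance and the common logarithmic discount $1/\log_2(\text{rank}+1)$, which may differ from the formulation used in the paper. The two users, the two methods, and all relevance values are hypothetical toy data chosen for illustration, not results from the paper.

```python
import math

def dcg(relevances):
    """DCG of a ranked list: sum of rel / log2(rank + 1), with ranks starting at 1."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances, num_relevant):
    """DCG divided by the best DCG achievable for a user with `num_relevant` relevant items."""
    ideal = dcg([1] * min(num_relevant, len(relevances)))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical toy data: number of relevant items per user, and the
# relevance (1 = relevant, 0 = not) of each method's top-3 list per user.
num_relevant = {"u1": 1, "u2": 3}
method_a = {"u1": [1, 0, 0], "u2": [1, 0, 0]}
method_b = {"u1": [0, 0, 0], "u2": [1, 1, 1]}

def mean_dcg(lists):
    """Average unnormalised DCG over users."""
    return sum(dcg(rels) for rels in lists.values()) / len(lists)

def mean_ndcg(lists):
    """Average per-user normalised DCG over users."""
    return sum(ndcg(rels, num_relevant[user])
               for user, rels in lists.items()) / len(lists)

for name, lists in [("A", method_a), ("B", method_b)]:
    print(f"method {name}: DCG={mean_dcg(lists):.3f}  nDCG={mean_ndcg(lists):.3f}")
# method A: DCG=1.000  nDCG=0.735
# method B: DCG=1.065  nDCG=0.500
```

In this toy setting method B collects more total discounted gain, so unnormalised DCG prefers B, while per-user normalisation rewards method A for fully serving the user with a single relevant item, so nDCG prefers A. Which ordering agrees with online reward depends on the assumptions the paper examines.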
Abstract
Approaches to recommendation are typically evaluated in one of two ways: (1) via a (simulated) online experiment, often seen as the gold standard, or (2) via some offline evaluation procedure, where the goal is to approximate the outcome of an online experiment. Several offline evaluation metrics have been adopted in the literature, inspired by ranking metrics prevalent in the field of Information Retrieval. (Normalised) Discounted Cumulative Gain (nDCG) is one such metric that has seen widespread adoption in empirical studies, and higher (n)DCG values have been used to present new methods as the state-of-the-art in top-$n$ recommendation for many years. Our work takes a critical look at this approach, and investigates when we can expect such metrics to approximate the gold standard outcome of an online experiment. We formally present the assumptions that are necessary to consider DCG…
Taxonomy
Topics: Decision-Making and Behavioral Economics · Economic and Environmental Valuation · Environmental Education and Sustainability
