Evaluating Performance and Bias of Negative Sampling in Large-Scale Sequential Recommendation Models

Arushi Prakash; Dimitrios Bermperidis; Srivas Chennu

arXiv:2410.17276 · cs.IR · October 30, 2024

TL;DR

This paper compares various negative sampling methods for large-scale sequential recommendation models, analyzing their impact on performance and bias across different dataset characteristics and popularity biases.

Contribution

It provides a comprehensive empirical evaluation of negative sampling techniques, highlighting their effects on model performance and bias in large-scale recommendation systems.

Findings

1. Random sampling reinforces popularity bias and favors head items.

2. Popularity-based methods offer more balanced performance across popularity bands.

3. Choice of negative sampling method significantly impacts model bias and effectiveness.
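
To make the phrase "performance across popularity bands" concrete, here is a minimal sketch of a per-band hit rate. It is an illustration only, not the paper's evaluation code: the function names `assign_bands` and `hit_rate_by_band`, the head/mid/tail split, and the 20%/50% cut-offs are all assumptions chosen for the example.

```python
from collections import Counter

def assign_bands(counts, head_frac=0.2, tail_frac=0.5):
    """Label each item head/mid/tail by interaction-count rank.

    The fractions are hypothetical cut-offs, not values from the paper.
    """
    # Most popular first; ties broken by item id for determinism.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    n = len(ranked)
    n_head = max(1, int(n * head_frac))
    n_tail = int(n * tail_frac)
    bands = {}
    for rank, item in enumerate(ranked):
        if rank < n_head:
            bands[item] = "head"
        elif rank >= n - n_tail:
            bands[item] = "tail"
        else:
            bands[item] = "mid"
    return bands

def hit_rate_by_band(targets, recommendations, bands, k=10):
    """Fraction of targets found in the top-k list, grouped by the target's band."""
    hits, totals = Counter(), Counter()
    for target, recs in zip(targets, recommendations):
        band = bands[target]
        totals[band] += 1
        if target in recs[:k]:
            hits[band] += 1
    return {band: hits[band] / totals[band] for band in totals}

# Toy interaction counts (made up for illustration).
counts = Counter({"a": 100, "b": 50, "c": 10, "d": 5, "e": 1})
bands = assign_bands(counts)
targets = ["a", "a", "d", "e"]
recommendations = [["a", "b"], ["b", "c"], ["d"], ["a"]]
per_band = hit_rate_by_band(targets, recommendations, bands, k=2)
```

A model that scores well overall but poorly on the tail band would show up here as a large gap between the per-band values.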

Abstract

Large-scale industrial recommendation models predict the most relevant items from catalogs containing millions or billions of options. To train these models efficiently, a small set of irrelevant items (negative samples) is selected from the vast catalog for each relevant item (positive example), helping the model distinguish between relevant and irrelevant items. Choosing the right negative sampling method is a common challenge. We address this by implementing and comparing various negative sampling methods - random, popularity-based, in-batch, mixed, adaptive, and adaptive with mixed variants - on modern sequential recommendation models. Our experiments, including hyperparameter optimization and 20x repeats on three benchmark datasets with varying popularity biases, show how the choice of method and dataset characteristics impact key model performance metrics. We also reveal that…
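
Four of the sampling strategies named in the abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation in the authors' repository: all function names are hypothetical, popularity is taken to mean raw interaction counts raised to an assumed exponent `alpha`, and "mixed" is assumed to mean in-batch negatives combined with uniformly sampled ones. The adaptive variants are omitted because they depend on model scores.

```python
import random
from collections import Counter

def random_negatives(catalog, positive, k, rng):
    """Uniformly sample k distinct items, excluding the positive."""
    candidates = [item for item in catalog if item != positive]
    return rng.sample(candidates, k)

def popularity_negatives(catalog, counts, positive, k, rng, alpha=1.0):
    """Sample k negatives (with replacement) proportional to count**alpha.

    alpha is an assumed smoothing exponent; at least one candidate
    must have a positive count.
    """
    candidates = [item for item in catalog if item != positive]
    weights = [counts[item] ** alpha for item in candidates]
    return rng.choices(candidates, weights=weights, k=k)

def in_batch_negatives(batch_positives, index):
    """Use the other examples' positives in the batch as negatives."""
    positive = batch_positives[index]
    return [
        item
        for i, item in enumerate(batch_positives)
        if i != index and item != positive
    ]

def mixed_negatives(catalog, batch_positives, index, n_random, rng):
    """Assumed definition: in-batch negatives plus uniform random ones."""
    positive = batch_positives[index]
    return in_batch_negatives(batch_positives, index) + random_negatives(
        catalog, positive, n_random, rng
    )

# Toy catalog and counts (made up for illustration).
rng = random.Random(0)
catalog = list(range(10))
counts = Counter({item: item + 1 for item in catalog})
batch = [1, 2, 3, 2]

uniform = random_negatives(catalog, 3, 4, rng)
popular = popularity_negatives(catalog, counts, 3, 5, rng)
in_batch = in_batch_negatives(batch, 1)
mixed = mixed_negatives(catalog, batch, 1, 3, rng)
```

In-batch negatives are themselves positives for other users, so they are drawn roughly in proportion to item popularity, while uniform sampling draws mostly from the long tail of the catalog; the sketch makes that difference in candidate distributions easy to inspect.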

Code & Models

Repositories

apple/ml-negative-sampling (tf, Official)

Taxonomy

Topics: Recommender Systems and Techniques · Privacy-Preserving Technologies in Data · Human Mobility and Location-Based Analysis

Methods: Sparse Evolutionary Training