Diversity-Augmented Negative Sampling for Implicit Collaborative Filtering
Yueqing Xuan, Kacper Sokol, Mark Sanderson, Jeffrey Chan

TL;DR
This paper introduces a diversity-augmented negative sampling method for implicit collaborative filtering that enhances training data diversity, leading to improved recommendation performance without increasing computational costs.
Contribution
The paper proposes a novel negative sampling approach that combines hard negatives with diverse, representative negatives drawn from a user-specific cache, producing more informative training data.
Findings
Improved recommendation accuracy across multiple datasets.
Enhanced diversity in negative samples without added computational complexity.
Consistent performance gains over existing sampling methods.
Abstract
Recommenders built upon implicit collaborative filtering are typically trained to distinguish between users' positive and negative preferences. When direct observations of the latter are unavailable, negative training data are constructed with sampling techniques. But since items often exhibit clustering in the latent space, existing methods tend to oversample negatives from dense regions, resulting in homogeneous training data and limited model expressiveness. To address these shortcomings, we propose a novel negative sampler with diversity guarantees. To achieve them, our approach first pairs each positive item of a user with one that they have not yet interacted with; this instance, called hard negative, is chosen as the top-scoring item according to the model. Instead of discarding the remaining highly informative items, we store them in a user-specific cache. Next, our…
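The first stage described in the abstract (pick the top-scoring un-interacted item as the hard negative, then cache the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scorer, the random candidate pool, the `pool_size` parameter, and the function name `sample_hard_negative` are all assumptions made for illustration. The diversity-aware selection from the cache is not shown because the abstract is truncated before describing it.

```python
import random


def dot(u, v):
    """Assumed scoring function: plain dot product of two embeddings."""
    return sum(a * b for a, b in zip(u, v))


def sample_hard_negative(user_vec, item_vecs, interacted, pool_size, rng):
    """Return (hard_negative, leftover_candidates) for one positive item.

    Hypothetical helper: assumes the user has at least one un-interacted item.
    """
    # Candidate pool: a random subset of items the user has not interacted with.
    unseen = [i for i in range(len(item_vecs)) if i not in interacted]
    pool = rng.sample(unseen, min(pool_size, len(unseen)))
    # Rank candidates by model score, highest first.
    ranked = sorted(pool, key=lambda i: dot(user_vec, item_vecs[i]), reverse=True)
    # The top-scoring candidate is the hard negative; the rest are kept, not discarded.
    return ranked[0], ranked[1:]


rng = random.Random(0)
item_vecs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(20)]
user_vec = [rng.gauss(0, 1) for _ in range(4)]
interacted = {0, 3, 7}

cache = {}  # user id -> cached candidate negatives (user-specific cache)
hard_neg, rest = sample_hard_negative(user_vec, item_vecs, interacted, 5, rng)
cache.setdefault("u1", []).extend(rest)
```

In this sketch the cache simply accumulates the non-selected candidates per user. How the method then draws diverse, representative negatives from that cache is left open here.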
Taxonomy
Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies
