Improving Collaborative Metric Learning with Efficient Negative Sampling

Viet-Anh Tran; Romain Hennequin; Jimena Royo-Letelier; Manuel Moussallam

arXiv:1909.10912 · cs.IR · September 25, 2019

TL;DR

This paper introduces a two-stage negative sampling method for collaborative metric learning (CML) that enables effective training with smaller batches, improving recommendation accuracy and reducing popularity bias.

Contribution

It proposes a two-stage negative sampling strategy that lets CML train effectively with batches an order of magnitude smaller than the default uniform sampling requires, making it easier to scale under memory limits.
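To make the idea of a two-stage sampler concrete, here is a minimal sketch in plain Python. This page does not describe what the two stages actually do, so both stages below are placeholder assumptions rather than the paper's procedure: stage 1 draws a small candidate pool uniformly from unseen items, and stage 2 keeps the candidate closest to the user, a common "hard negative" heuristic. The names `two_stage_negative` and `pool_size` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_stage_negative(user, item_vectors, seen_items, pool_size, rng):
    """Pick one negative item index for `user` in two steps.

    Stage 1 (assumption): sample up to `pool_size` candidates uniformly
    from the items the user has not interacted with.
    Stage 2 (assumption): keep the candidate closest to the user, since
    nearby negatives are the ones most likely to violate the margin.
    """
    unseen = [j for j in range(len(item_vectors)) if j not in seen_items]
    pool = rng.sample(unseen, min(pool_size, len(unseen)))
    return min(pool, key=lambda j: squared_distance(user, item_vectors[j]))


# Toy data: item 0 is the user's positive; items 1 and 3 are near, item 2 is far.
rng = random.Random(0)
item_vectors = [[0.0, 0.0], [0.1, 0.0], [3.0, 4.0], [0.0, 0.2]]
user_vector = [0.0, 0.0]
seen = {0}

chosen = two_stage_negative(user_vector, item_vectors, seen, pool_size=3, rng=rng)
```

With a pool that covers all three unseen items, the selection step returns the nearest one (item 1). The official repository listed under Code & Models is the reference for the stages the authors actually use.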

Findings

01 Improved accuracy in recommendation tasks

02 Reduced popularity bias in learned representations

03 Effective training with smaller batch sizes

Abstract

Distance metric learning based on triplet loss has been applied with success in a wide range of applications such as face recognition, image retrieval, speaker change detection and recently recommendation with the CML model. However, as we show in this article, CML requires large batches to work reasonably well because of a too simplistic uniform negative sampling strategy for selecting triplets. Due to memory limitations, this makes it difficult to scale in high-dimensional scenarios. To alleviate this problem, we propose here a 2-stage negative sampling strategy which finds triplets that are highly informative for learning. Our strategy allows CML to work effectively in terms of accuracy and popularity bias, even when the batch size is an order of magnitude smaller than what would be needed with the default uniform sampling. We demonstrate the suitability of the proposed strategy for…
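As a rough illustration of the baseline the abstract refers to, the sketch below computes a triplet hinge loss with a uniformly sampled negative. It assumes a squared Euclidean distance and a margin of 1.0, which are illustrative choices not stated on this page, and the helper names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_hinge_loss(user, positive, negative, margin=1.0):
    """Hinge loss on one (user, positive, negative) triplet.

    The loss is zero once the negative is farther from the user than the
    positive by at least `margin` (margin value is an assumption).
    """
    gap = squared_distance(user, positive) - squared_distance(user, negative)
    return max(0.0, margin + gap)


def sample_uniform_negative(num_items, seen_items, rng):
    """Draw an item index uniformly among items the user has not seen."""
    while True:
        j = rng.randrange(num_items)
        if j not in seen_items:
            return j


# Toy data: item 0 is the positive, item 1 is a near negative, item 2 a far one.
rng = random.Random(0)
item_vectors = [[0.0, 0.0], [0.1, 0.0], [3.0, 4.0]]
user_vector = [0.0, 0.0]
seen = {0}

negative = sample_uniform_negative(len(item_vectors), seen, rng)
loss = triplet_hinge_loss(user_vector, item_vectors[0], item_vectors[negative])

far_loss = triplet_hinge_loss(user_vector, item_vectors[0], item_vectors[2])
near_loss = triplet_hinge_loss(user_vector, item_vectors[0], item_vectors[1])
```

In this toy setup the far negative yields zero loss and therefore no gradient, while the near negative yields a positive loss. One plausible reading of the abstract is that uniform sampling draws many such zero-loss triplets, so large batches are needed to find enough informative ones.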

Code & Models

Repositories

deezer/sigir2019-2stagesampling (tf, official)

Taxonomy

Methods: Triplet Loss