Cross-Batch Negative Sampling for Training Two-Tower Recommenders
Jinpeng Wang, Jieming Zhu, Xiuqiang He

TL;DR
This paper introduces Cross-Batch Negative Sampling (CBNS), a sampling strategy that reuses item embeddings encoded in recent mini-batches when training two-tower recommender models, which reduces the reliance on large batch sizes.
Contribution
The paper proposes CBNS, a simple sampling method that reuses encoded item embeddings from recent mini-batches during two-tower model training, addressing the memory and feature-encoding cost of large-batch training.
Findings
CBNS improves training efficiency and model performance.
Theoretical analysis confirms the effectiveness of CBNS.
Empirical results show CBNS outperforms traditional in-batch negative sampling.
Abstract
The two-tower architecture has been widely applied for learning item and user representations, which is important for large-scale recommender systems. Many two-tower models are trained using various in-batch negative sampling strategies, where the effects of such strategies inherently rely on the size of mini-batches. However, training two-tower models with a large batch size is inefficient, as it demands a large volume of memory for item and user contents and consumes a lot of time for feature encoding. Interestingly, we find that neural encoders can output relatively stable features for the same input after warming up in the training process. Based on such facts, we propose a simple yet effective sampling strategy called Cross-Batch Negative Sampling (CBNS), which takes advantage of the encoded item embeddings from recent mini-batches to boost the model training. Both theoretical…
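The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed-size FIFO cache of item embeddings and a plain softmax cross-entropy over in-batch plus cached items, and it leaves out the encoders, gradient handling, and any sampling-bias correction. The names `EmbeddingQueue`, `dot`, and `cross_batch_softmax_loss` are hypothetical and chosen only for this example.

```python
import math
from collections import deque


class EmbeddingQueue:
    """FIFO cache of item embeddings from recent mini-batches (hypothetical helper)."""

    def __init__(self, max_size):
        # deque with maxlen drops the oldest entries once the cache is full
        self.buffer = deque(maxlen=max_size)

    def enqueue(self, item_embeddings):
        self.buffer.extend(item_embeddings)

    def items(self):
        return list(self.buffer)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_batch_softmax_loss(user_embs, item_embs, cached_item_embs):
    """Mean softmax cross-entropy where user i's positive is item i.

    Negatives are the other in-batch items plus every cached item.
    Assumption: in real training the cached embeddings would be treated
    as constants (no gradient flows back through them).
    """
    candidates = list(item_embs) + list(cached_item_embs)
    total = 0.0
    for i, u in enumerate(user_embs):
        logits = [dot(u, v) for v in candidates]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(user_embs)


# Toy usage with 2-dimensional embeddings
queue = EmbeddingQueue(max_size=4)
users = [[1.0, 0.0], [0.0, 1.0]]
items = [[1.0, 0.0], [0.0, 1.0]]

in_batch_only = cross_batch_softmax_loss(users, items, queue.items())
queue.enqueue(items)  # cache this batch's item embeddings for later steps
with_cache = cross_batch_softmax_loss(users, items, queue.items())
```

In this toy run the second loss is computed against four candidates instead of two, so each user sees more negatives without encoding any additional items. The cache size controls how many past embeddings are reused; the stability observation in the abstract is what would justify reusing slightly stale embeddings.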
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Recommender Systems and Techniques · Machine Learning and ELM
