Robust Training Objectives Improve Embedding-based Retrieval in Industrial Recommendation Systems
Matthew Kolodner, Mingxuan Ju, Zihao Fan, Tong Zhao, Elham Ghazizadeh, Yan Wu, Neil Shah, Yozen Liu

TL;DR
This paper investigates the effectiveness of self-supervised multitask learning (SSMTL) as a robust training objective for embedding-based retrieval in large-scale industrial recommendation systems, demonstrating significant improvements in user engagement metrics.
Contribution
It validates the application of SSMTL in industrial RS at scale, addressing data augmentation costs and task alignment issues, and shows its positive impact through online A/B testing.
Findings
Up to 5.45% increase in new friends made
1.91% improvement for cold-start users
Statistically significant performance gains in production
Abstract
Improving recommendation systems (RS) can greatly enhance the user experience across many domains, such as social media. Many RS utilize embedding-based retrieval (EBR) approaches to retrieve candidates for recommendation. In an EBR system, the embedding quality is key. According to recent literature, self-supervised multitask learning (SSMTL) has shown strong performance on academic benchmarks in embedding learning and resulted in an overall improvement in multiple downstream tasks, demonstrating greater resilience to the adverse conditions between each downstream task and thereby increased robustness and task generalization ability through the training objective. However, whether or not the success of SSMTL in academia as a robust training objective translates to large-scale (i.e., over hundreds of millions of users and interactions in-between) industrial RS still requires…
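To make the retrieval step mentioned in the abstract concrete, here is a minimal sketch of embedding-based retrieval: candidates are scored against a query embedding and the top k are returned. It is an illustration only, not the paper's system. The function names (`dot`, `retrieve_top_k`), the dot-product scoring, and the exhaustive scan are assumptions made for this example; a production system would typically use an approximate nearest-neighbor index instead of scoring every candidate.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Return the indices of the k candidates most similar to the query.

    Similarity is the dot product (an assumption for this sketch).
    Every candidate is scored, so this is exact but not scalable.
    """
    scores = [dot(query, c) for c in candidates]
    # Rank candidate indices by score, highest first.
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical toy data: one user embedding and three candidate embeddings.
user_embedding = [1.0, 0.0]
candidate_embeddings = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
top_two = retrieve_top_k(user_embedding, candidate_embeddings, k=2)
```

Because ranking depends only on the embeddings, any training objective that improves them changes which candidates are retrieved, which is why the abstract calls embedding quality the key factor.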
Taxonomy
Topics: Recommender Systems and Techniques · Text and Document Classification Technologies · Image Retrieval and Classification Techniques
Methods: ALIGN
