kNN-Embed: Locally Smoothed Embedding Mixtures For Multi-interest Candidate Retrieval
Ahmed El-Kishky, Thomas Markovich, Kenny Leung, Frank Portman, Aria Haghighi, Ying Xiao

TL;DR
kNN-Embed enhances candidate retrieval in recommendation systems by modeling user interests as a smoothed mixture over learned clusters, significantly improving diversity and recall over standard methods.
Contribution
This paper introduces kNN-Embed, a novel method that models user interests as a mixture over learned clusters to improve diversity in dense ANN-based candidate retrieval.
Findings
Significant improvements in recall across three datasets.
Enhanced diversity in retrieved candidate sets.
Open-sourced a large Twitter follow-graph dataset.
Abstract
Candidate retrieval is the first stage in recommendation systems, where a light-weight system is used to retrieve potentially relevant items for an input user. These candidate items are then ranked and pruned in later stages of recommender systems using a more complex ranking model. As the top of the recommendation funnel, it is important to retrieve a high-recall candidate set to feed into downstream ranking models. A common approach is to leverage approximate nearest neighbor (ANN) search from a single dense query embedding; however, this approach can yield a low-diversity result set with many near duplicates. As users often have multiple interests, candidate retrieval should ideally return a diverse set of candidates reflective of the user's multiple interests. To this end, we introduce kNN-Embed, a general approach to improving diversity in dense ANN-based retrieval. kNN-Embed…
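To make the contrast in the abstract concrete, here is a minimal sketch of single-query retrieval versus retrieval split across a mixture over item clusters. It is an illustration under stated assumptions, not the authors' implementation: the abstract is truncated before the method details, so the function names, the `alpha` smoothing weight, smoothing toward the engagement of similar users, the use of cluster centroids as per-cluster queries, and the exact dot-product search standing in for an ANN index are all hypothetical choices made for this example. Cluster assignments are assumed to be given (for instance from k-means).

```python
import numpy as np

def single_query_retrieve(query, items, n):
    """Exact top-n by dot product; a stand-in for an ANN index lookup."""
    return np.argsort(-(items @ query))[:n]

def mixture_retrieve(engaged, neighbor_engaged, items, assign, k, n, alpha=0.3):
    """Retrieve about n items by splitting the budget across clusters.

    engaged / neighbor_engaged: item ids the user / similar users interacted with.
    assign: cluster id of every item (assumed precomputed).
    alpha: hypothetical weight pulling the user's mixture toward the neighbors'.
    """
    own = np.bincount(assign[engaged], minlength=k).astype(float)
    nbr = np.bincount(assign[neighbor_engaged], minlength=k).astype(float)
    own /= own.sum()
    nbr /= nbr.sum()
    mix = (1 - alpha) * own + alpha * nbr  # smoothed mixture weights, sums to 1

    results = []
    for c in np.argsort(-mix):  # visit the heaviest clusters first
        budget = int(round(mix[c] * n))
        if budget == 0:
            continue
        members = np.flatnonzero(assign == c)
        # Per-cluster query: the cluster centroid (an assumption of this sketch).
        query = items[members].mean(axis=0)
        top = members[single_query_retrieve(query, items[members], budget)]
        results.extend(top.tolist())
    return results[:n]  # rounding may overshoot n, so truncate

# Toy data: two well-separated "interests" in a 2-D embedding space.
rng = np.random.default_rng(0)
items = np.vstack([
    rng.normal([5.0, 0.0], 0.1, size=(50, 2)),  # cluster 0
    rng.normal([0.0, 5.0], 0.1, size=(50, 2)),  # cluster 1
])
assign = np.repeat([0, 1], 50)

engaged = np.array([0, 1, 2, 3, 4, 5, 50, 51])  # mostly cluster 0, some cluster 1
neighbor_engaged = np.array([6, 7, 52, 53])     # similar users' engagements

user_vec = items[engaged].mean(axis=0)          # single dense user embedding
single = single_query_retrieve(user_vec, items, 10)
mixed = mixture_retrieve(engaged, neighbor_engaged, items, assign, k=2, n=10)

print("single-query clusters:", np.bincount(assign[single], minlength=2))
print("mixture clusters:     ", np.bincount(assign[mixed], minlength=2))
```

In this toy setup the averaged user embedding sits close to the dominant interest, so the single-query results all come from one cluster, while the mixture-based budget reserves some slots for the minority interest. That is the diversity effect the abstract describes, shown here only on synthetic data.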
Taxonomy
Topics: Advanced Graph Neural Networks · Recommender Systems and Techniques · Topic Modeling
