Multi-Probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions and Enhancing Model Freshness in Large-Scale Recommenders
Ziliang Zhao, Bi Xue, Emma Lin, Tianqi Lu, Mengjiao Zhou, Kaustubh Vartak, Shakhzod Ali-Zade, Tao Li, Bin Kuang, Rui Jian, Bin Wen, Dennis van der Staay, Yixin Bao, Eddy Li, Chao Deng, Henry Wei, Songbin Liu, Qifan Wang, and Kai Ren

TL;DR
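The paper introduces MPZCH, a linear-probing-based indexing method for large-scale recommenders that mitigates embedding collisions (often eliminating them entirely with reasonable table sizing), improves embedding freshness, and maintains production-scale efficiency. It is implemented in TorchRec.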
Contribution
MPZCH is a novel linear-probing-based indexing mechanism that mitigates collisions and enhances embedding freshness in large-scale recommendation systems.
Findings
MPZCH achieves zero collisions for user embeddings.
It significantly improves item embedding freshness and quality.
It maintains comparable training and inference performance.
Abstract
Embedding tables are critical components of large-scale recommendation systems, facilitating the efficient mapping of high-cardinality categorical features into dense vector representations. However, as the volume of unique IDs expands, traditional hash-based indexing methods suffer from collisions that degrade model performance and personalization quality. We present Multi-Probe Zero Collision Hash (MPZCH), a novel indexing mechanism based on linear probing that effectively mitigates embedding collisions. With reasonable table sizing, it often eliminates these collisions entirely while maintaining production-scale efficiency. MPZCH utilizes auxiliary tensors and high-performance CUDA kernels to implement configurable probing and active eviction policies. By retiring obsolete IDs and resetting reassigned slots, MPZCH prevents the stale embedding inheritance typical of hash-based…
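The mechanism described in the abstract (bounded linear probing, an auxiliary record of which ID owns each slot, and eviction with a reset of reassigned slots) can be illustrated with a minimal pure-Python sketch. This is not the TorchRec implementation or the paper's CUDA kernels. The class name `ProbingIndex`, its parameters, the modulo hash, the time-to-live eviction rule, and the fallback to a shared slot when the probe window is full are all assumptions made for illustration.

```python
EMPTY = -1  # sentinel marking a slot that no ID owns


class ProbingIndex:
    """Hypothetical sketch: maps raw IDs to embedding-table slots."""

    def __init__(self, num_slots, max_probe, ttl):
        self.num_slots = num_slots
        self.max_probe = max_probe  # how many consecutive slots to probe
        self.ttl = ttl              # assumed policy: evict IDs unseen for > ttl steps
        # Auxiliary arrays, one entry per slot (tensors in a real system).
        self.identities = [EMPTY] * num_slots
        self.last_seen = [0] * num_slots

    def lookup(self, raw_id, step):
        """Return (slot, status), status in {"hit", "claimed", "collision"}.

        "claimed" means the slot was empty or its owner was evicted, so the
        caller should reset that embedding row instead of inheriting it.
        """
        start = raw_id % self.num_slots  # stand-in for a real hash function
        reclaimable = None
        for i in range(min(self.max_probe, self.num_slots)):
            slot = (start + i) % self.num_slots
            owner = self.identities[slot]
            if owner == raw_id:
                self.last_seen[slot] = step
                return slot, "hit"
            expired = owner != EMPTY and step - self.last_seen[slot] > self.ttl
            if reclaimable is None and (owner == EMPTY or expired):
                reclaimable = slot  # remember it, but keep scanning for a match
        if reclaimable is not None:
            self.identities[reclaimable] = raw_id
            self.last_seen[reclaimable] = step
            return reclaimable, "claimed"
        # Assumed fallback: every probed slot is owned by a live ID, so share one.
        return start, "collision"
```

The whole probe window is scanned before a slot is claimed because eviction can leave reclaimable slots ahead of the slot an ID already owns. In this sketch a collision occurs only when every slot in the window holds a live ID, which is consistent with the abstract's note that reasonable table sizing often eliminates collisions.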
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Graph Neural Networks · Topic Modeling
