MERGE: Next-Generation Item Indexing Paradigm for Large-Scale Streaming Recommendation
Jing Yan, Yimeng Bai, Zongyu Liu, Yahui Liu, Junwei Wang, Jingze Huang, Haoda Li, Sihao Ding, Shaohui Ruan, Yang Zhang

TL;DR
MERGE introduces an adaptive, hierarchical item indexing method that improves assignment accuracy, cluster balance, and recommendation performance in large-scale streaming systems by addressing skewed and non-stationary item distributions.
Contribution
The paper presents MERGE, an item indexing paradigm that dynamically constructs clusters and merges them fine-to-coarse into a hierarchical index, aiming at improved accuracy and efficiency in streaming recommender systems.
Findings
Improves assignment accuracy over existing methods
Enhances cluster uniformity and separation
Leads to significant business metric gains in online tests
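As an illustration of what "cluster uniformity" can mean in practice, the sketch below scores how evenly items are spread across clusters using normalized entropy of cluster occupancy. This is an assumed metric chosen for illustration; the summary does not state which uniformity measure the paper uses, and the function name `occupancy_uniformity` is hypothetical.

```python
import math
from collections import Counter


def occupancy_uniformity(assignments, num_clusters=None):
    """Normalized entropy of cluster occupancy (assumed metric, not from the paper).

    Returns 1.0 when items are spread evenly over all clusters and 0.0 when
    every item falls into a single cluster.
    """
    counts = Counter(assignments)
    # If the total number of clusters is not given, count only non-empty ones.
    k = num_clusters if num_clusters is not None else len(counts)
    if k <= 1 or not assignments:
        return 0.0
    n = len(assignments)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(k)


balanced = occupancy_uniformity([0, 1, 2, 3])
skewed = occupancy_uniformity([0, 0, 0, 0, 0, 0, 1, 2])
```

Here `balanced` is higher than `skewed`, reflecting the imbalance that a highly skewed item distribution produces.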
Abstract
Item indexing, which maps a large corpus of items into compact discrete representations, is critical for both discriminative and generative recommender systems, yet existing Vector Quantization (VQ)-based approaches struggle with the highly skewed and non-stationary item distributions common in streaming industry recommenders, leading to poor assignment accuracy, imbalanced cluster occupancy, and insufficient cluster separation. To address these challenges, we propose MERGE, a next-generation item indexing paradigm that adaptively constructs clusters from scratch, dynamically monitors cluster occupancy, and forms hierarchical index structures via fine-to-coarse merging. Extensive experiments demonstrate that MERGE significantly improves assignment accuracy, cluster uniformity, and cluster separation compared with existing indexing methods, while online A/B tests show substantial gains…
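The three steps named in the abstract (building clusters from scratch, tracking occupancy, and fine-to-coarse merging) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' algorithm: the cosine-similarity threshold, the running-mean centroid update, the greedy pairwise merge, and all names (`Cluster`, `assign_stream`, `merge_fine_to_coarse`, `sim_threshold`, `target_k`) are hypothetical choices made here.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


class Cluster:
    def __init__(self, vec):
        self.centroid = list(vec)
        self.count = 1  # occupancy, tracked as items arrive

    def add(self, vec):
        self.count += 1
        # Running-mean centroid update (assumed; the paper may differ).
        self.centroid = [c + (v - c) / self.count
                         for c, v in zip(self.centroid, vec)]


def assign_stream(items, sim_threshold=0.8):
    """Assign each item to its nearest cluster, or open a new one from scratch."""
    clusters, labels = [], []
    for vec in items:
        best, best_sim = None, -2.0
        for idx, cl in enumerate(clusters):
            s = cosine(vec, cl.centroid)
            if s > best_sim:
                best, best_sim = idx, s
        if best is None or best_sim < sim_threshold:
            clusters.append(Cluster(vec))
            labels.append(len(clusters) - 1)
        else:
            clusters[best].add(vec)
            labels.append(best)
    return clusters, labels


def merge_fine_to_coarse(clusters, target_k):
    """Greedily merge the most similar clusters; return a fine-id -> coarse-id map."""
    groups = [(list(c.centroid), c.count, [i]) for i, c in enumerate(clusters)]
    while len(groups) > max(target_k, 1):
        best_pair, best_sim = None, -2.0
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                s = cosine(groups[i][0], groups[j][0])
                if s > best_sim:
                    best_pair, best_sim = (i, j), s
        i, j = best_pair
        ci, ni, mi = groups[i]
        cj, nj, mj = groups[j]
        # Occupancy-weighted centroid of the merged group.
        merged = ([(a * ni + b * nj) / (ni + nj) for a, b in zip(ci, cj)],
                  ni + nj, mi + mj)
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return {fine: coarse
            for coarse, (_, _, members) in enumerate(groups)
            for fine in members}


items = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9], [-1, 0]]
clusters, labels = assign_stream(items, sim_threshold=0.8)
parent = merge_fine_to_coarse(clusters, target_k=2)
```

Each item then carries a two-level index: its fine cluster `labels[i]` and the coarse cluster `parent[labels[i]]`. The quadratic pair search is only suitable for a toy example; a production system would need an approximate nearest-neighbor structure.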
Taxonomy
Topics: Recommender Systems and Techniques · Information Retrieval and Search Behavior · Expert finding and Q&A systems
