SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search
Guowen Li, Yuepeng Zhang, Shunyu Zhang, Yi Zhang, Xiaoze Jiang, Yi Wang, Jingwei Zhuo

TL;DR
SID-Coord introduces a novel framework that integrates trainable semantic IDs into short-video ranking models, improving generalization for long-tailed items while maintaining memorization of frequent interactions.
Contribution
The paper proposes a unified semantic ID framework that combines attention-based fusion, adaptive gating, and interest alignment modules to improve ID-based ranking.
Findings
Improves the long-play rate in search by 0.664%.
Increases search playback duration by 0.369%.
Balances memorization and generalization in production.
Abstract
Large-scale short-video search ranking models are typically trained on sparse co-occurrence signals over hashed item identifiers (HIDs). While effective at memorizing frequent interactions, such ID-based models struggle to generalize to long-tailed items with limited exposure. This memorization-generalization trade-off remains a longstanding challenge in such industrial systems. We propose SID-Coord, a lightweight Semantic ID framework that incorporates discrete, trainable semantic IDs (SIDs) directly into ID-based ranking models. Instead of treating semantic signals as auxiliary dense features, SID-Coord represents semantics as structured identifiers and coordinates HID-based memorization with SID-based generalization within a unified modeling framework. To enable effective coordination, SID-Coord introduces three components: (1) an attention-based fusion module over hierarchical SIDs…
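To make the idea of coordinating hashed IDs with hierarchical semantic IDs concrete, the following is a minimal sketch and not the authors' implementation. The abstract is truncated, so everything here is an assumption made for illustration: the table sizes, the choice of the HID embedding as the attention query, the scalar sigmoid gate, and all names (`fuse_sid_levels`, `gate_hid_sid`, `embed_item`). Weights are random rather than trained, and the interest alignment module is not sketched because the source gives no detail on it.

```python
import math
import random

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_table(num_rows, dim, rng):
    # Randomly initialised embedding table (untrained, illustration only).
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(num_rows)]

def fuse_sid_levels(sid_codes, sid_tables, query):
    """Attention-pool one embedding per SID level into a single vector.

    Assumption: scaled dot-product attention with `query` as the query.
    """
    level_embs = [table[code] for table, code in zip(sid_tables, sid_codes)]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, e) / scale for e in level_embs])
    fused = [sum(w * e[i] for w, e in zip(weights, level_embs))
             for i in range(len(query))]
    return fused, weights

def gate_hid_sid(hid_emb, sid_emb, gate_w, gate_b):
    """Mix the two embeddings with a scalar sigmoid gate g in (0, 1).

    Assumption: output = g * hid + (1 - g) * sid.
    """
    z = dot(gate_w, hid_emb + sid_emb) + gate_b  # list concat = [hid; sid]
    g = 1.0 / (1.0 + math.exp(-z))
    mixed = [g * h + (1.0 - g) * s for h, s in zip(hid_emb, sid_emb)]
    return mixed, g

# Hypothetical sizes, chosen only to keep the example small.
rng = random.Random(0)
DIM = 8
HID_BUCKETS = 1000
SID_CODEBOOK_SIZES = [16, 16, 16]  # three hierarchical SID levels

hid_table = make_table(HID_BUCKETS, DIM, rng)
sid_tables = [make_table(n, DIM, rng) for n in SID_CODEBOOK_SIZES]
gate_w = [rng.gauss(0.0, 0.1) for _ in range(2 * DIM)]
gate_b = 0.0

def embed_item(item_id, sid_codes):
    # Modulo stands in for a real hash function over item identifiers.
    hid_emb = hid_table[item_id % HID_BUCKETS]
    sid_emb, weights = fuse_sid_levels(sid_codes, sid_tables, hid_emb)
    mixed, g = gate_hid_sid(hid_emb, sid_emb, gate_w, gate_b)
    return mixed, weights, g

item_emb, level_weights, gate = embed_item(123456, [3, 7, 1])
```

In this sketch, two items that share SID codes reuse the same SID embeddings even when their hashed IDs differ, which is one plausible way a long-tailed item could borrow signal from semantically similar, frequently seen items. A gate of this kind could lean on the HID embedding for popular items and on the SID embedding for rare ones, though how the paper actually conditions its gate is not stated in the visible text.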
