DOS: Dual-Flow Orthogonal Semantic IDs for Recommendation in Meituan
Junwei Yin, Senjie Kou, Changhao Li, Shuli Wang, Xue Wei, Yinqiu Huang, Yinhua Zhu, Haitao Wang, Xingxing Wang

TL;DR
This paper introduces DOS, a dual-flow orthogonal semantic ID method that improves recommendation quality by aligning the Semantic ID codebook space with the generation space and by reducing semantic loss during quantization. It is evaluated in offline experiments and in online deployment.
Contribution
The paper proposes a novel dual-flow framework with orthogonal residual quantization to enhance semantic preservation and alignment in recommendation systems.
Findings
DOS outperforms existing methods in offline and online tests.
DOS has been deployed in Meituan's app, which serves hundreds of millions of users.
Significantly improves recommendation accuracy and semantic fidelity.
Abstract
Semantic IDs serve as a key component in generative recommendation systems. They not only incorporate open-world knowledge from large language models (LLMs) but also compress the semantic space to reduce generation difficulty. However, existing methods suffer from two major limitations: (1) the lack of contextual awareness in generation tasks leads to a gap between the Semantic ID codebook space and the generation space, resulting in suboptimal recommendations; and (2) suboptimal quantization methods exacerbate semantic loss in LLMs. To address these issues, we propose the Dual-Flow Orthogonal Semantic IDs (DOS) method. Specifically, DOS employs a user-item dual-flow framework that leverages collaborative signals to align the Semantic ID codebook space with the generation space. Furthermore, we introduce an orthogonal residual quantization scheme that rotates the semantic space to an…
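For readers unfamiliar with residual quantization, the following is a minimal sketch of the general idea, not the DOS implementation. It applies a fixed orthogonal rotation to an embedding and then quantizes it level by level, so that the sequence of chosen codeword indices serves as the Semantic ID. The abstract is truncated before it says how DOS chooses its rotation, so the matrix Q, the toy codebooks, the embedding, and all function names below are hypothetical and chosen only for illustration.

```python
def rotate(vec, q):
    """Apply a square matrix q (given as a list of rows) to vec."""
    return [sum(q_ij * v_j for q_ij, v_j in zip(row, vec)) for row in q]


def nearest(vec, codebook):
    """Index of the codeword closest to vec in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
    )


def residual_quantize(vec, codebooks):
    """Return (codes, residual): one codeword index per level, plus what is left over."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        k = nearest(residual, codebook)
        codes.append(k)
        # Subtract the chosen codeword; the next level quantizes the remainder.
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return codes, residual


# Assumption: a fixed 90-degree rotation stands in for the learned or derived
# orthogonal transform, which the truncated abstract does not specify.
Q = [[0.0, -1.0], [1.0, 0.0]]

# Assumption: two hand-written codebooks (coarse, then fine), not trained ones.
codebooks = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
    [[0.25, 0.0], [0.0, 0.25], [-0.25, 0.0], [0.0, -0.25]],
]

item_embedding = [1.0, 0.25]  # hypothetical 2-D item embedding
rotated = rotate(item_embedding, Q)
semantic_id, residual = residual_quantize(rotated, codebooks)
print(semantic_id)  # → [1, 2]
```

Because Q is orthogonal, the rotation preserves vector norms and pairwise distances, so it changes which codewords are selected without distorting the geometry of the embedding space. In this toy case the two levels reconstruct the rotated vector exactly, leaving a zero residual; with real embeddings the residual is generally nonzero.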
Taxonomy
Topics: Recommender Systems and Techniques · Topic Modeling · Machine Learning in Healthcare
