FusID: Modality-Fused Semantic IDs for Generative Music Recommendation
Haven Kim, Yupeng Hou, Julian McAuley

TL;DR
FusID is a modality-fused semantic ID framework for generative music recommendation. It captures inter-modal interactions and reduces redundancy across modalities, which improves recommendation accuracy and eliminates ID conflicts.
Contribution
The paper proposes FusID, a framework that fuses multiple modalities into unified representations and converts them into discrete tokens. It targets two limitations of systems that tokenize each modality independently: redundancy across modalities and failure to capture inter-modal interactions.
Findings
Achieves zero ID conflicts among item semantic IDs.
Outperforms baselines in MRR and Recall@k metrics.
Reduces codebook underutilization.
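For readers unfamiliar with the metrics named above, the following is a minimal sketch of MRR and Recall@k. It assumes a next-item setting with exactly one ground-truth item per query. The paper's exact evaluation protocol is not given in this summary, and the function names here are illustrative only.

```python
def reciprocal_rank(ranked, target):
    """Return 1/rank of the target in the ranked list, or 0.0 if absent."""
    for position, item in enumerate(ranked, start=1):
        if item == target:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked, target, k):
    """With a single ground-truth item, Recall@k is 1.0 on a hit in the top k, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def evaluate(rankings, targets, k):
    """Average both metrics over all queries (assumes one target per query)."""
    n = len(targets)
    mrr = sum(reciprocal_rank(r, t) for r, t in zip(rankings, targets)) / n
    recall = sum(recall_at_k(r, t, k) for r, t in zip(rankings, targets)) / n
    return mrr, recall


if __name__ == "__main__":
    # Toy data: three queries with hypothetical item names.
    rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
    targets = ["a", "a", "d"]
    print(evaluate(rankings, targets, k=2))
```

Higher is better for both metrics. MRR rewards placing the target near the top of the list, while Recall@k only checks whether it appears within the first k positions.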
Abstract
Generative recommendation systems have achieved significant advances by leveraging semantic IDs to represent items. However, existing approaches that tokenize each modality independently face two critical limitations: (1) redundancy across modalities that reduces efficiency, and (2) failure to capture inter-modal interactions that limits item representation. We introduce FusID, a modality-fused semantic ID framework that addresses these limitations through three key components: (i) multimodal fusion that learns unified representations by jointly encoding information across modalities, (ii) representation learning that brings frequently co-occurring item embeddings closer while maintaining distinctiveness and preventing feature redundancy, and (iii) product quantization that converts the fused continuous embeddings into multiple discrete tokens to mitigate ID conflict. Evaluated on a…
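Component (iii) can be illustrated with a generic product quantization sketch. This is not the authors' implementation. It assumes the fused embeddings are already available as a NumPy array, splits each vector into equal sub-vectors, learns one small k-means codebook per sub-vector, and emits one token per codebook. The helpers `count_conflicts` and `codebook_utilization` are hypothetical names. The first uses the usual meaning of an ID conflict (two items sharing an identical token sequence), which is an assumption about how the paper counts conflicts.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids


def pq_fit(emb, n_sub, k, seed=0):
    """Learn one codebook per sub-vector; emb.shape[1] must be divisible by n_sub."""
    subs = np.split(emb, n_sub, axis=1)
    return [kmeans(s, k, seed=seed + i) for i, s in enumerate(subs)]


def pq_encode(emb, codebooks):
    """Map each item to a tuple of tokens, one nearest-centroid index per codebook."""
    subs = np.split(emb, len(codebooks), axis=1)
    codes = []
    for s, cb in zip(subs, codebooks):
        dists = ((s[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        codes.append(dists.argmin(1))
    return np.stack(codes, axis=1)


def count_conflicts(codes):
    """Number of items whose token sequence duplicates another item's sequence."""
    return len(codes) - len(np.unique(codes, axis=0))


def codebook_utilization(codes, k):
    """Fraction of the k entries in each codebook that at least one item uses."""
    return [len(np.unique(codes[:, i])) / k for i in range(codes.shape[1])]


if __name__ == "__main__":
    # Random stand-in for fused item embeddings (200 items, 16 dimensions).
    emb = np.random.default_rng(0).normal(size=(200, 16))
    codebooks = pq_fit(emb, n_sub=4, k=8)
    codes = pq_encode(emb, codebooks)
    print(codes[:3], count_conflicts(codes), codebook_utilization(codes, 8))
```

With `n_sub` codebooks of `k` entries each, the scheme can express up to `k ** n_sub` distinct IDs. That is why emitting multiple tokens per item leaves room to avoid collisions, though whether collisions actually vanish depends on the embeddings and the codebook sizes.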
Taxonomy
Topics: Music and Audio Processing · Recommender Systems and Techniques · Advanced Graph Neural Networks
