SynGR: Unleashing the Potential of Cross-Modal Synergy for Generative Recommendation
Wei Chen, Xingyu Guo, Shuang Li, Fuwei Zhang, Meng Yuan, Jing Fan, Zhao Zhang, Deqing Wang, Fuzhen Zhuang

TL;DR
SynGR is a generative recommendation framework that exploits cross-modal dependencies to capture emergent item semantics that no single modality provides on its own.
Contribution
The paper proposes a synergistic approach that explicitly exploits cross-modal dependencies, moving beyond the alignment-centric fusion that existing generative recommendation models rely on.
Findings
SynGR outperforms existing methods on three benchmark datasets.
Explicit cross-modal dependency exploitation enhances recommendation quality.
The framework captures emergent item semantics beyond individual modalities.
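As a toy illustration of what "beyond individual modalities" can mean, the sketch below builds a hypothetical four-item catalogue in which an item property is the XOR of one binary text feature and one binary image feature. This is a standard textbook example of synergistic information and is not taken from the paper; the feature names, the catalogue, and the `mutual_information` helper are all assumptions made for illustration, and SynGR's actual mechanism is not shown here.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(pairs):
    """Return I(X;Y) in bits for a list of equally likely (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical catalogue: (text_feature, image_feature, item_property),
# where the property is the XOR of the two modality features.
items = [(t, v, t ^ v) for t, v in product([0, 1], repeat=2)]

mi_text = mutual_information([(t, y) for t, v, y in items])
mi_image = mutual_information([(v, y) for t, v, y in items])
mi_joint = mutual_information([((t, v), y) for t, v, y in items])

print(f"text alone:  {mi_text:.1f} bits")
print(f"image alone: {mi_image:.1f} bits")
print(f"text+image:  {mi_joint:.1f} bits")
```

In this toy setup each modality alone carries no information about the property, while the two together determine it completely, which is the kind of dependency that a purely alignment-oriented fusion would have no reason to preserve.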
Abstract
Generative Recommendation (GR) has emerged as a promising paradigm by formulating item recommendation as a sequence-to-sequence generation task over item identifiers. Recent studies have incorporated multimodal signals to provide richer token-level evidence for generation. However, existing approaches largely rely on alignment-centric fusion and underexplore synergistic information across modalities. In practice, synergistic information plays a critical role in capturing emergent item properties that cannot be inferred from any single modality alone. Such properties encode intrinsic item semantics and guide user preferences, enabling models to move beyond surface-level feature matching. To address this limitation, we propose SynGR, a synergistic generative recommendation framework that explicitly encourages the exploitation of cross-modal dependencies during generation. By…
