SynGR: Unleashing the Potential of Cross-Modal Synergy for Generative Recommendation

Wei Chen; Xingyu Guo; Shuang Li; Fuwei Zhang; Meng Yuan; Jing Fan; Zhao Zhang; Deqing Wang; Fuzhen Zhuang

arXiv:2605.18920·cs.IR·May 20, 2026

TL;DR

SynGR introduces a framework that leverages cross-modal dependencies to improve generative recommendation by capturing emergent item semantics beyond individual modalities.

Contribution

It proposes a novel synergistic approach that explicitly exploits cross-modal dependencies, moving beyond alignment-centric fusion in generative recommendation models.

Findings

1. SynGR outperforms existing methods on three benchmark datasets.

2. Explicit cross-modal dependency exploitation enhances recommendation quality.

3. The framework captures emergent item semantics beyond individual modalities.

Abstract

Generative Recommendation (GR) has emerged as a promising paradigm by formulating item recommendation as a sequence-to-sequence generation task over item identifiers. Recent studies have incorporated multimodal signals to provide richer token-level evidence for generation. However, existing approaches largely rely on alignment-centric fusion and underexplore synergistic information across modalities. In practice, synergistic information plays a critical role in capturing emergent item properties that cannot be inferred from any single modality alone. Such properties encode intrinsic item semantics and guide user preferences, enabling models to move beyond surface-level feature matching. To address this limitation, we propose SynGR, a synergistic generative recommendation framework that explicitly encourages the exploitation of cross-modal dependencies during generation. By…
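
The idea of a property that "cannot be inferred from any single modality alone" can be made concrete with a toy example. The sketch below is not SynGR's method, which the truncated abstract does not specify. It is a minimal illustration that assumes two hypothetical binary modality features (`t` for text, `v` for image) and an item property defined as their XOR. Each feature alone carries zero mutual information about the property, while the pair determines it fully. The `synergy_gap` quantity is a crude proxy chosen for illustration, not a formal synergy measure.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(pairs):
    """Return I(X; Y) in bits for a list of equally likely (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical toy data: every combination of a binary text feature t and a
# binary image feature v, with the item property y defined as t XOR v.
samples = [(t, v, t ^ v) for t, v in product((0, 1), repeat=2)]

mi_text = mutual_information([(t, y) for t, v, y in samples])
mi_image = mutual_information([(v, y) for t, v, y in samples])
mi_joint = mutual_information([((t, v), y) for t, v, y in samples])

# Information available only when both modalities are read together
# (illustrative proxy, assumed here for the example).
synergy_gap = mi_joint - max(mi_text, mi_image)

print(f"I(text; y)        = {mi_text:.3f} bits")
print(f"I(image; y)       = {mi_image:.3f} bits")
print(f"I(text, image; y) = {mi_joint:.3f} bits")
print(f"synergy gap       = {synergy_gap:.3f} bits")
```

In this toy setting, a fusion scheme that only aligns the two modalities with each other would gain nothing about `y`, since the two features are independent and neither is individually informative. Only a model that conditions on both jointly can recover the property.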
