TL;DR
MMM4Rec is a novel multi-modal sequential recommendation framework that leverages state space duality for efficient transfer learning, achieving faster convergence and improved accuracy over existing methods.
Contribution
It introduces a new algebraic constraint mechanism and a dual-stage architecture combining cross-modal alignment and temporal fusion for enhanced transferability.
Findings
Achieves 10x faster convergence in transfer learning.
Outperforms existing models in multi-modal recommendation accuracy.
Demonstrates state-of-the-art performance on large-scale datasets.
Abstract
Sequential Recommendation (SR) models infer user preferences from interaction histories. While transferable Multi-modal SR models outperform traditional ID-based approaches, existing methods struggle with slow fine-tuning convergence due to complex optimization requirements and negative transfer effects. We propose MMM4Rec (Multi-Modal Mamba for Sequential Recommendation), a novel Multi-modal SR framework that incorporates a dedicated algebraic constraint mechanism for efficient transfer learning. By combining State Space Duality (SSD)'s temporal decay properties with a globally-aware temporal modeling design, our model dynamically prioritizes key modality information, overcoming limitations of Transformer-based approaches. The framework implements a constrained two-stage process: (1) sequence-level cross-modal alignment via shared projection matrices, followed by (2) temporal fusion…
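The two ideas named in the abstract, a shared projection across modalities and SSD's temporal decay, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the dimensions, the random features, and the choice of a scalar decay per time step are all assumptions made for illustration, and the temporal fusion stage is left out because the abstract is truncated before describing it. The sketch only shows that a decaying linear recurrence and its masked matrix form compute the same outputs, which is the "duality" the SSD name refers to.

```python
import numpy as np

def ssd_recurrent(x, a, B, C):
    """Recurrent form: h_t = a_t * h_{t-1} + B_t x_t^T, y_t = C_t^T h_t."""
    T, P = x.shape
    N = B.shape[1]
    h = np.zeros((N, P))          # hidden state
    y = np.zeros((T, P))
    for t in range(T):
        h = a[t] * h + np.outer(B[t], x[t])   # older inputs decay by a[t]
        y[t] = C[t] @ h
    return y

def ssd_matrix(x, a, B, C):
    """Matrix form: Y = (L * (C B^T)) X with a lower-triangular decay mask L."""
    cum = np.cumsum(np.log(a))
    # L[i, j] = a[j+1] * ... * a[i] for i >= j, and 0 above the diagonal
    L = np.tril(np.exp(cum[:, None] - cum[None, :]))
    return (L * (C @ B.T)) @ x

# Hypothetical setup: text and image item features, already encoded to the
# same width d, are mapped into one space by a single shared matrix.
rng = np.random.default_rng(0)
T, d, P, N = 6, 8, 4, 3
W_shared = rng.normal(size=(d, P))            # assumed shared projection
text_feats = rng.normal(size=(T, d))
image_feats = rng.normal(size=(T, d))
z_text, z_image = text_feats @ W_shared, image_feats @ W_shared

a = rng.uniform(0.5, 0.99, size=T)            # decay factors in (0, 1)
B = rng.normal(size=(T, N))
C = rng.normal(size=(T, N))

y_rec = ssd_recurrent(z_text, a, B, C)
y_mat = ssd_matrix(z_text, a, B, C)
print(np.allclose(y_rec, y_mat))
```

Because each decay factor lies in (0, 1), the weight of an interaction shrinks multiplicatively with its distance from the current step, so recent items dominate unless the learned parameters say otherwise. The matrix form is the one that parallelizes well over the sequence, while the recurrent form is convenient for step-by-step inference.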
