Bridging Domain Gaps between Pretrained Multimodal Models and Recommendations
Wenyu Zhang, Jie Luo, Xinming Zhang, Yuan Fang

TL;DR
This paper introduces PTMRec, a parameter-efficient tuning framework that effectively bridges the domain gap between pre-trained multimodal models and recommendation systems, improving performance without costly retraining.
Contribution
It proposes a novel dual-stage tuning strategy that enhances multimodal recommendation by aligning pre-trained models with domain-specific data efficiently.
Findings
PTMRec outperforms baseline methods in recommendation tasks.
The framework reduces computational costs compared to full fine-tuning.
It effectively bridges domain gaps without additional pre-training.
Abstract
With the explosive growth of multimodal content online, pre-trained visual-language models have shown great potential for multimodal recommendation. However, while these models achieve decent performance when applied in a frozen manner, surprisingly, due to significant domain gaps (e.g., feature distribution discrepancy and task objective misalignment) between pre-training and personalized recommendation, adopting a joint training approach instead leads to performance worse than baseline. Existing approaches either rely on simple feature extraction or require computationally expensive full model fine-tuning, struggling to balance effectiveness and efficiency. To tackle these challenges, we propose Parameter-efficient Tuning for Multimodal Recommendation (PTMRec), a novel framework that bridges the domain gap between pre-trained models and…
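The abstract contrasts frozen feature extraction, full fine-tuning, and parameter-efficient tuning, but it is truncated before it describes PTMRec's actual modules. The sketch below is therefore not the authors' method. It only illustrates the general idea of parameter-efficient tuning with a generic residual bottleneck adapter placed on top of a frozen encoder. All names (`FrozenEncoder`, `BottleneckAdapter`) and all dimensions are hypothetical, and the "encoder" is a single random linear map standing in for a real pre-trained model.

```python
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class FrozenEncoder:
    """Hypothetical stand-in for a pre-trained multimodal encoder.

    Its weights are fixed after construction and never updated.
    """

    def __init__(self, in_dim, out_dim, rng):
        self.weight = make_matrix(out_dim, in_dim, rng)

    def num_params(self):
        return len(self.weight) * len(self.weight[0])

    def __call__(self, x):
        return matvec(self.weight, x)


class BottleneckAdapter:
    """Generic residual adapter: down-project, ReLU, up-project, add.

    In this sketch it is the only component that would be trained.
    """

    def __init__(self, dim, bottleneck, rng):
        self.dim = dim
        self.bottleneck = bottleneck
        self.down = make_matrix(bottleneck, dim, rng)
        # Zero-initialised up-projection: the adapter starts as an identity map,
        # so the frozen features pass through unchanged before any training.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def num_params(self):
        return 2 * self.dim * self.bottleneck

    def __call__(self, h):
        z = [max(0.0, v) for v in matvec(self.down, h)]
        delta = matvec(self.up, z)
        return [a + b for a, b in zip(h, delta)]


rng = random.Random(0)
encoder = FrozenEncoder(in_dim=64, out_dim=32, rng=rng)    # assumed sizes
adapter = BottleneckAdapter(dim=32, bottleneck=4, rng=rng)  # assumed sizes

raw_item = [rng.uniform(-1.0, 1.0) for _ in range(64)]
frozen_feat = encoder(raw_item)      # what a frozen pipeline would use directly
adapted_feat = adapter(frozen_feat)  # what the recommender would consume instead

# Share of parameters that would receive gradient updates.
trainable_fraction = adapter.num_params() / (
    adapter.num_params() + encoder.num_params()
)
```

With these assumed sizes the adapter holds 256 of 2,304 parameters, so roughly 11% of the weights would be trained. Real pre-trained encoders are far larger than this toy, which is why adapter-style tuning is typically much cheaper than full fine-tuning.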
