Bridging Domain Gaps between Pretrained Multimodal Models and Recommendations

Wenyu Zhang; Jie Luo; Xinming Zhang; Yuan Fang

arXiv:2502.15542 · cs.IR · February 24, 2025

TL;DR

This paper introduces PTMRec, a parameter-efficient tuning framework that bridges the domain gap between pre-trained multimodal models and recommendation systems, improving performance without costly full-model fine-tuning.

Contribution

It proposes a novel dual-stage tuning strategy that enhances multimodal recommendation by aligning pre-trained models with domain-specific data efficiently.

Findings

01. PTMRec outperforms baseline methods in recommendation tasks.

02. The framework reduces computational costs compared to full fine-tuning.

03. It effectively bridges domain gaps without additional pre-training.
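
To make the cost claim concrete, the following is a minimal sketch of why parameter-efficient tuning is cheaper than full fine-tuning. It is not PTMRec's method, whose modules are not described on this page. It assumes a generic bottleneck adapter (down-projection, nonlinearity, up-projection) attached to a frozen backbone. The backbone size, adapter count, hidden width, and bottleneck width are all hypothetical values chosen for illustration, and the helper names are invented here.

```python
def linear_params(d_in, d_out, bias=True):
    """Parameter count of a dense layer mapping d_in -> d_out."""
    return d_in * d_out + (d_out if bias else 0)


def adapter_params(d_model, bottleneck):
    """Generic bottleneck adapter: d_model -> bottleneck -> d_model.

    Assumption: this is a standard adapter shape, not the module used in the paper.
    """
    down = linear_params(d_model, bottleneck)
    up = linear_params(bottleneck, d_model)
    return down + up


def trainable_fraction(backbone_params, n_adapters, d_model, bottleneck):
    """Return (trainable parameter count, share of all parameters that is trainable)
    when the backbone is frozen and only the adapters are updated."""
    trainable = n_adapters * adapter_params(d_model, bottleneck)
    total = backbone_params + trainable
    return trainable, trainable / total


# Hypothetical sizes, for illustration only (not taken from the paper).
BACKBONE_PARAMS = 150_000_000
trainable, fraction = trainable_fraction(
    BACKBONE_PARAMS, n_adapters=24, d_model=768, bottleneck=64
)
print(f"trainable parameters: {trainable:,}")
print(f"share of all parameters: {fraction:.2%}")
```

Under these assumed sizes, only a small share of the parameters receives gradient updates, whereas full fine-tuning would update every backbone weight. The actual savings reported for PTMRec depend on its own design and are given in the paper, not here.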

Abstract

With the explosive growth of multimodal content online, pre-trained visual-language models have shown great potential for multimodal recommendation. However, while these models achieve decent performance when applied in a frozen manner, surprisingly, due to significant domain gaps (e.g., feature distribution discrepancy and task objective misalignment) between pre-training and personalized recommendation, adopting a joint training approach instead leads to performance worse than baseline. Existing approaches either rely on simple feature extraction or require computationally expensive full model fine-tuning, struggling to balance effectiveness and efficiency. To tackle these challenges, we propose Parameter-efficient Tuning for Multimodal Recommendation (PTMRec), a novel framework that bridges the domain gap between pre-trained models and…
