Dynamic Multimodal Fusion via Meta-Learning Towards Micro-Video Recommendation

Han Liu; Yinwei Wei; Fan Liu; Wenjie Wang; Liqiang Nie; Tat-Seng Chua

arXiv:2501.07110 · cs.CV · January 14, 2025

TL;DR

This paper introduces MetaMMF, a meta-learning framework for dynamic multimodal fusion in micro-video recommendation, significantly improving recommendation accuracy by customizing fusion parameters per video.

Contribution

The paper proposes a novel meta-learning-based approach for dynamic multimodal fusion, addressing the limitations of static fusion methods in micro-video recommendation.

Findings

01. MetaMMF outperforms state-of-the-art models on benchmark datasets.

02. MetaMMF achieves higher recommendation accuracy with efficient training.

03. Canonical polyadic decomposition enhances model efficiency without sacrificing performance.
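To make the third finding concrete, here is a minimal NumPy sketch of how a canonical polyadic (CP) factorization can shrink a 3-D weight tensor that turns meta information into per-item fusion weights. The tensor shape, the rank, and the names `fuse_full` and `fuse_factored` are assumptions made for illustration; the summary above does not specify them, and this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: fused output, concatenated input, meta vector, CP rank.
D_OUT, D_IN, D_META, RANK = 5, 12, 12, 3

# CP factors: the 3-D tensor is a sum of RANK rank-one tensors.
A = rng.normal(size=(D_OUT, RANK))
B = rng.normal(size=(D_IN, RANK))
C = rng.normal(size=(D_META, RANK))

def full_tensor():
    # T[o, i, m] = sum_r A[o, r] * B[i, r] * C[m, r]
    return np.einsum("or,ir,mr->oim", A, B, C)

def fuse_full(x, meta):
    # Contract the tensor with the meta vector to get this item's weight
    # matrix, then apply it to the item's features.
    w_item = np.einsum("oim,m->oi", full_tensor(), meta)
    return w_item @ x

def fuse_factored(x, meta):
    # Same result without ever building the 3-D tensor or the weight matrix.
    return A @ ((C.T @ meta) * (B.T @ x))

full_params = D_OUT * D_IN * D_META        # 720 entries in the dense tensor
cp_params = RANK * (D_OUT + D_IN + D_META)  # 87 entries in the three factors

x = rng.normal(size=D_IN)
meta = rng.normal(size=D_META)
same = np.allclose(fuse_full(x, meta), fuse_factored(x, meta))
```

In this toy setting the factored form stores 87 numbers instead of 720 and yields the same output as the dense contraction. Whether a low rank preserves accuracy on real data is an empirical question that the paper's experiments address.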

Abstract

Multimodal information (e.g., visual, acoustic, and textual) has been widely used to enhance representation learning for micro-video recommendation. For integrating multimodal information into a joint representation of micro-video, multimodal fusion plays a vital role in the existing micro-video recommendation approaches. However, the static multimodal fusion used in previous studies is insufficient to model the various relationships among multimodal information of different micro-videos. In this paper, we develop a novel meta-learning-based multimodal fusion framework called Meta Multimodal Fusion (MetaMMF), which dynamically assigns parameters to the multimodal fusion function for each micro-video during its representation learning. Specifically, MetaMMF regards the multimodal fusion of each micro-video as an independent task. Based on the meta information extracted from the…
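The contrast between static and dynamic fusion described in the abstract can be sketched in a few lines of NumPy. Everything below is a hypothetical illustration: the feature sizes, the single linear map standing in for the meta learner, and the choice to use the concatenated modal features as the meta information are assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODAL = 4             # hypothetical size of each modality's feature vector
D_IN = 3 * D_MODAL      # visual + acoustic + textual, concatenated
D_OUT = 5               # hypothetical size of the fused representation

# Static fusion: one weight matrix shared by every micro-video.
W_static = rng.normal(size=(D_OUT, D_IN))

# Dynamic fusion: a stand-in meta learner (a single linear map here)
# produces a separate weight matrix for each micro-video.
W_meta = rng.normal(size=(D_OUT * D_IN, D_IN)) * 0.1

def static_fusion(x):
    return np.tanh(W_static @ x)

def dynamic_fusion(x):
    meta = x  # assumption: meta information is the concatenated features
    w_item = (W_meta @ meta).reshape(D_OUT, D_IN)
    return np.tanh(w_item @ x), w_item

# Two hypothetical micro-videos with different modal features.
video_a = rng.normal(size=D_IN)
video_b = rng.normal(size=D_IN)

fused_a, w_a = dynamic_fusion(video_a)
fused_b, w_b = dynamic_fusion(video_b)
```

Here `w_a` and `w_b` differ because each is generated from its own item's features, whereas `static_fusion` applies the same `W_static` to both. Treating each item's fusion as its own task, as the abstract puts it, corresponds to giving each item its own `w_item`.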

Code & Models

Repositories

hanliu95/metammf (PyTorch, official)