Partitioner Guided Modal Learning Framework

Guimin Hu; Yi Xin; Lijie Hu; Zhihong Zhu; Hasti Seifi

arXiv:2507.11661·cs.CL·July 17, 2025

TL;DR

This paper introduces PgM, a framework that splits each modal representation into uni-modal and paired-modal features and learns the two kinds of feature with dedicated components, with the aim of improving flexibility and transferability across multimodal tasks.

Contribution

The paper proposes a partitioner-guided framework that segments modal representations into uni-modal and paired-modal features, enabling more thorough and adaptable multimodal learning.

Findings

01  PgM improves performance across four multimodal tasks.

02  The framework transfers to existing models.

03  Visualization reveals the distinct contributions of uni-modal and paired-modal features.

Abstract

Multimodal learning benefits from information in multiple modalities, and each learned modal representation can be divided into uni-modal features, which can be learned from uni-modal training, and paired-modal features, which can be learned from cross-modal interaction. Building on this perspective, we propose a partitioner-guided modal learning framework, PgM, which consists of a modal partitioner, a uni-modal learner, a paired-modal learner, and a uni-paired modal decoder. The modal partitioner segments the learned modal representation into uni-modal and paired-modal features. The modal learner incorporates two dedicated components for uni-modal and paired-modal learning. The uni-paired modal decoder reconstructs the modal representation from the uni-modal and paired-modal features. PgM offers three key benefits: 1) thorough learning of uni-modal and paired-modal features, 2) flexible distribution adjustment for uni-modal…
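The four components named in the abstract can be pictured with a small sketch. The abstract does not say how the partitioner, learners, or decoder are implemented, so everything below is an assumption made for illustration: the soft sigmoid gate, the identity uni-modal learner, the averaging paired-modal learner, the additive decoder, and all function and variable names are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def modal_partitioner(rep, gate_logits):
    """Split one modal representation into uni-modal and paired-modal parts.

    Assumption: a per-dimension soft gate g in (0, 1) routes g * x to the
    uni-modal part and (1 - g) * x to the paired-modal part.
    """
    gates = [sigmoid(g) for g in gate_logits]
    uni = [g * x for g, x in zip(gates, rep)]
    paired = [(1.0 - g) * x for g, x in zip(gates, rep)]
    return uni, paired


def uni_modal_learner(uni):
    # Placeholder (assumption): identity. A real learner would be trainable.
    return list(uni)


def paired_modal_learner(paired_a, paired_b):
    # Placeholder (assumption): average the two modalities' paired parts
    # as a stand-in for cross-modal interaction.
    return [(a + b) / 2.0 for a, b in zip(paired_a, paired_b)]


def uni_paired_decoder(uni, paired):
    # Assumption: reconstruct the representation by element-wise addition.
    return [u + p for u, p in zip(uni, paired)]


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
dim = 6
text_rep = [random.gauss(0, 1) for _ in range(dim)]
audio_rep = [random.gauss(0, 1) for _ in range(dim)]
text_gates = [random.gauss(0, 1) for _ in range(dim)]
audio_gates = [random.gauss(0, 1) for _ in range(dim)]

text_uni, text_paired = modal_partitioner(text_rep, text_gates)
audio_uni, audio_paired = modal_partitioner(audio_rep, audio_gates)

# Cross-modal features shared by both modalities in this toy setup.
shared = paired_modal_learner(text_paired, audio_paired)
text_hat = uni_paired_decoder(uni_modal_learner(text_uni), shared)

# Sanity check: with no learning applied, the two parts sum back to the input.
split_error = mse(text_rep, uni_paired_decoder(text_uni, text_paired))
```

In this toy version the gate decides, per dimension, how much of a representation is treated as uni-modal versus paired-modal, which is one simple way to read the "segments" wording of the abstract; the paper's actual partitioning mechanism may differ.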

Taxonomy

Topics: Text and Document Classification Technologies · Advanced Data Processing Techniques