UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning
Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng

TL;DR
This paper introduces UniS-MMC, a novel multimodal contrastive learning approach that leverages unimodal predictions to improve the quality of multimodal representations, outperforming existing methods on image-text classification benchmarks.
Contribution
The paper presents a unimodality-supervised contrastive learning framework that better captures inter-modality relationships by aligning unimodal representations with their more effective counterparts.
Findings
Outperforms state-of-the-art multimodal methods on UPMC-Food-101 and N24News benchmarks.
Demonstrates the effectiveness of unimodal supervision in multimodal contrastive learning.
Provides detailed ablation studies confirming the advantages of the proposed approach.
Abstract
Multimodal learning aims to imitate the human ability to acquire complementary information from multiple modalities for various downstream tasks. However, traditional aggregation-based multimodal fusion methods ignore the inter-modality relationship, treat each modality equally, suffer from sensor noise, and thus reduce multimodal learning performance. In this work, we propose a novel multimodal contrastive method to explore more reliable multimodal representations under the weak supervision of unimodal prediction. Specifically, we first capture task-related unimodal representations and the unimodal predictions from the introduced unimodal prediction task. Then the unimodal representations are aligned with the more effective one by the designed multimodal contrastive method under the supervision of the unimodal predictions. Experimental results with fused features on two image-text classification…
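The alignment step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that "more effective" means the modality whose unimodal prediction assigns the higher probability to the true label, and it uses a plain cosine distance as a stand-in for the paper's contrastive loss. The function names (`pick_anchor`, `alignment_term`) are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pick_anchor(image_logits, text_logits, label):
    """Assumption: the more effective modality is the one whose unimodal
    prediction gives the true label the higher probability."""
    p_image = softmax(image_logits)[label]
    p_text = softmax(text_logits)[label]
    return "image" if p_image >= p_text else "text"


def alignment_term(image_vec, text_vec, image_logits, text_logits, label):
    """Return (anchor, loss). The anchor names the modality treated as the
    target; the loss is the cosine distance between the two unimodal
    representations (a simplified stand-in for a contrastive loss)."""
    anchor = pick_anchor(image_logits, text_logits, label)
    loss = 1.0 - cosine_similarity(image_vec, text_vec)
    return anchor, loss


# Toy sample with two classes; the true label is class 1.
anchor, loss = alignment_term(
    image_vec=[1.0, 0.0],
    text_vec=[0.0, 1.0],
    image_logits=[2.0, 0.5],   # image branch favors the wrong class
    text_logits=[0.1, 3.0],    # text branch favors the true class
    label=1,
)
print(anchor, loss)
```

In a trainable version, the anchor's representation would typically be held fixed (no gradient) so that only the weaker modality is pulled toward it; that choice is likewise an assumption of this sketch rather than something the abstract states.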
Taxonomy
Topics: Advanced Chemical Sensor Technologies · Text and Document Classification Technologies
