Contrastive Multimodal Fusion with TupleInfoNCE

Yunze Liu; Qingnan Fan; Shanghang Zhang; Hao Dong; Thomas Funkhouser; Li Yi

arXiv:2107.02575 · cs.CV · July 7, 2021

TL;DR

This paper introduces TupleInfoNCE, a novel contrastive learning method for multimodal data that enhances the learning of both shared and complementary information across modalities, improving downstream task performance.

Contribution

It proposes a new contrastive loss that contrasts positive and negative tuples and additionally uses negative tuples composed from modalities describing different scenes. The loss comes with a theoretical justification based on mutual information and an optimized sampling strategy.
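
For orientation, the generic InfoNCE objective that tuple-level contrastive losses build on can be written as below. This is the standard form from the contrastive learning literature, not the paper's exact TupleInfoNCE objective. The symbols $x$ (anchor), $y^{+}$ (positive), $y_j^{-}$ (negatives), the positive critic $f$, and the total sample count $N$ are notation assumed here for illustration.

```latex
\mathcal{L}_{\mathrm{NCE}}
  = -\,\mathbb{E}\left[
      \log \frac{f(x, y^{+})}
                {f(x, y^{+}) + \sum_{j=1}^{N-1} f(x, y_j^{-})}
    \right],
\qquad
I(X; Y) \;\ge\; \log N - \mathcal{L}_{\mathrm{NCE}}
```

Minimizing the loss therefore tightens a lower bound on the mutual information between the two contrasted views. In a tuple setting, $x$ and $y$ would each be a tuple of modalities, and the negative set would also contain composed tuples.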

Findings

01. Outperforms previous methods on three downstream tasks

02. Effectively captures shared and complementary multimodal information

03. Ensures weaker modalities are not ignored during learning

Abstract

This paper proposes a method for representation learning of multimodal data using contrastive losses. A traditional approach is to contrast different modalities to learn the information shared between them. However, that approach could fail to learn the complementary synergies between modalities that might be useful for downstream tasks. Another approach is to concatenate all the modalities into a tuple and then contrast positive and negative tuple correspondences. However, that approach could consider only the stronger modalities while ignoring the weaker ones. To address these issues, we propose a novel contrastive learning objective, TupleInfoNCE. It contrasts tuples based not only on positive and negative correspondences but also by composing new negative tuples using modalities describing different scenes. Training with these additional negatives encourages the learning model to…
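
The idea of composing extra negatives can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `encode_tuple` is a hypothetical stand-in encoder (concatenate and L2-normalize), `compose_negatives` swaps one modality at a time with a uniformly sampled other scene (the paper's optimized sampling strategy is not reproduced), and the temperature value is arbitrary.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def encode_tuple(tup):
    """Hypothetical encoder: concatenate modality features, then L2-normalize."""
    flat = [x for modality in tup for x in modality]
    norm = math.sqrt(dot(flat, flat)) or 1.0
    return [x / norm for x in flat]


def compose_negatives(anchor_idx, scenes, rng):
    """Build one negative per modality slot by replacing that slot of the
    anchor tuple with the same modality taken from a different scene."""
    anchor = scenes[anchor_idx]
    others = [i for i in range(len(scenes)) if i != anchor_idx]
    negatives = []
    for slot in range(len(anchor)):
        donor = scenes[rng.choice(others)]  # uniform sampling (assumption)
        composed = list(anchor)
        composed[slot] = donor[slot]
        negatives.append(tuple(composed))
    return negatives


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: cross-entropy with the positive at index 0."""
    q = encode_tuple(anchor)
    logits = [dot(q, encode_tuple(positive)) / temperature]
    logits += [dot(q, encode_tuple(n)) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy data: 5 scenes, each a tuple of 2 modalities with 4 features each.
rng = random.Random(0)
scenes = [tuple([rng.gauss(0, 1) for _ in range(4)] for _ in range(2))
          for _ in range(5)]

anchor = scenes[0]
# Positive: a lightly perturbed copy of the anchor (stand-in for augmentation).
positive = tuple([x + 0.01 * rng.gauss(0, 1) for x in m] for m in anchor)
# Negatives: whole tuples from other scenes plus the composed tuples.
negatives = list(scenes[1:]) + compose_negatives(0, scenes, rng)
loss = info_nce(anchor, positive, negatives)
```

Because each composed negative differs from the anchor in a single modality, the only way to tell it apart from the positive is to look at that modality, which is the intuition behind not ignoring the weaker ones.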

Code & Models

Repositories

hoi4d/TupleInfoNCE (PyTorch, official)

Taxonomy

Methods: Contrastive Learning