Decoupling Common and Unique Representations for Multimodal Self-supervised Learning

Yi Wang; Conrad M Albrecht; Nassim Ait Ali Braham; Chenying Liu; Zhitong Xiong; Xiao Xiang Zhu

arXiv:2309.05300 · cs.CV · July 22, 2024 · 1 citation

TL;DR

This paper introduces DeCUR, a multimodal self-supervised learning method that separates representations common to all modalities from those unique to each one. It is evaluated on radar-optical, RGB-elevation, and RGB-depth data, in both multimodal and modality-missing settings.

Contribution

DeCUR distinguishes inter-modal from intra-modal embeddings through multimodal redundancy reduction. The learned representation keeps both the information common to all modalities and the information unique to each, which lets it integrate complementary information across modalities.

Findings

01 Consistent improvement across three multimodal scenarios: radar-optical, RGB-elevation, and RGB-depth.

02 The improvement holds in both multimodal and modality-missing settings.

03 The improvement holds regardless of architecture.

Abstract

The increasing availability of multi-sensor data sparks wide interest in multimodal self-supervised learning. However, most existing approaches learn only common representations across modalities while ignoring intra-modal training and modality-unique representations. We propose Decoupling Common and Unique Representations (DeCUR), a simple yet effective method for multimodal self-supervised learning. By distinguishing inter- and intra-modal embeddings through multimodal redundancy reduction, DeCUR can integrate complementary information across different modalities. We evaluate DeCUR in three common multimodal scenarios (radar-optical, RGB-elevation, and RGB-depth), and demonstrate its consistent improvement regardless of architectures and for both multimodal and modality-missing settings. With thorough experiments and comprehensive analysis, we hope this work can provide valuable…
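
The abstract names multimodal redundancy reduction as the mechanism but does not spell out a loss. The following is a minimal sketch of one way such an objective could look, assuming a Barlow Twins-style cross-correlation loss in which the first `n_common` embedding dimensions are treated as common and the rest as unique. The function names, the dimension split, and the weight `lam` are all illustrative assumptions and are not taken from the text above.

```python
import numpy as np

def cross_correlation(z_a, z_b, eps=1e-9):
    """Cross-correlation matrix (D x D) between two embedding batches (N x D)."""
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    return z_a.T @ z_b / z_a.shape[0]

def off_diagonal_sq_sum(c):
    """Sum of squared off-diagonal entries of a square matrix."""
    return float((c ** 2).sum() - (np.diag(c) ** 2).sum())

def redundancy_reduction_loss(c, lam=0.005):
    """Push the diagonal toward 1 (aligned) and the off-diagonal toward 0."""
    on_diag = float(((np.diag(c) - 1.0) ** 2).sum())
    return on_diag + lam * off_diagonal_sq_sum(c)

def intra_modal_loss(z_view1, z_view2, lam=0.005):
    """Assumed intra-modal term: two augmented views of the same modality."""
    return redundancy_reduction_loss(cross_correlation(z_view1, z_view2), lam)

def decoupled_cross_modal_loss(z_m1, z_m2, n_common, lam=0.005):
    """Assumed cross-modal term with a hypothetical common/unique split.

    Common dimensions are aligned across modalities; unique dimensions are
    decorrelated across modalities (diagonal pushed toward 0 instead of 1).
    """
    c = cross_correlation(z_m1, z_m2)
    c_common = c[:n_common, :n_common]
    c_unique = c[n_common:, n_common:]
    loss_common = redundancy_reduction_loss(c_common, lam)
    loss_unique = float((np.diag(c_unique) ** 2).sum()) + lam * off_diagonal_sq_sum(c_unique)
    return loss_common, loss_unique
```

Under these assumptions, a total objective would sum the cross-modal common and unique terms with one intra-modal term per modality. Feeding two identical embedding batches gives a small common loss but a large unique loss, since the unique dimensions are penalized for being correlated across modalities.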

Code & Models

Models

🤗 wangyi111/DeCUR · model · ♡ 3

Taxonomy

Topics: Advanced Optical Sensing Technologies · Meteorological Phenomena and Simulations · Target Tracking and Data Fusion in Sensor Networks