MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT

Xiaomin Ouyang; Jason Wu; Tomoyoshi Kimura; Yihan Lin; Gunjan Verma; Tarek Abdelzaher; Mani Srivastava

arXiv:2411.12126·cs.LG·March 6, 2025

TL;DR

MMBind introduces a novel method for multimodal learning in IoT that constructs pseudo-paired datasets and employs weighted contrastive learning to effectively utilize distributed, heterogeneous, and incomplete data sources.

Contribution

It proposes a new data binding and learning framework that enables multimodal models to be trained on distributed, heterogeneous, and incomplete IoT data.

Findings

01 Outperforms state-of-the-art methods on ten real-world datasets

02 Effectively handles data incompleteness and domain shifts

03 Enhances multimodal foundation model training in IoT

Abstract

Multimodal sensing systems are increasingly prevalent in various real-world applications. Most existing multimodal learning approaches heavily rely on training with a large amount of synchronized, complete multimodal data. However, such a setting is impractical in real-world IoT sensing applications where data is typically collected by distributed nodes with heterogeneous data modalities, and is also rarely labeled. In this paper, we propose MMBind, a new data binding approach for multimodal learning on distributed and heterogeneous IoT data. The key idea of MMBind is to construct a pseudo-paired multimodal dataset for model training by binding data from disparate sources and incomplete modalities through a sufficiently descriptive shared modality. We also propose a weighted contrastive learning approach to handle domain shifts among disparate data, coupled with an adaptive multimodal…
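The binding idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes two datasets that each hold embeddings of the shared modality plus one other modality, pairs samples by nearest neighbor in the shared modality, and reuses the match similarity as a per-pair weight in an InfoNCE-style contrastive loss. The function names (`bind_by_shared_modality`, `weighted_info_nce`), the cosine-similarity matching rule, and the weighting scheme are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def bind_by_shared_modality(shared_a, shared_b):
    """Pair each sample in dataset A with its most similar sample in dataset B.

    shared_a, shared_b: (n_a, d) and (n_b, d) embeddings of the shared modality.
    Returns the matched index in B for every row of A, plus a weight in [0, 1]
    taken from the cosine similarity of the match (an assumed weighting rule).
    """
    sim = l2_normalize(shared_a) @ l2_normalize(shared_b).T  # (n_a, n_b)
    match = sim.argmax(axis=1)
    weight = np.clip(sim[np.arange(len(shared_a)), match], 0.0, 1.0)
    return match, weight


def weighted_info_nce(z_x, z_y, weight, temperature=0.1):
    """InfoNCE over pseudo-pairs (row i of z_x with row i of z_y),
    where each pair's loss term is scaled by its binding weight."""
    z_x, z_y = l2_normalize(z_x), l2_normalize(z_y)
    logits = z_x @ z_y.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_pair = -np.diag(log_prob)
    return float((weight * per_pair).sum() / (weight.sum() + 1e-12))
```

In use, the indices returned by `bind_by_shared_modality` would select rows of dataset B's non-shared modality to sit alongside dataset A's non-shared modality, forming the pseudo-paired set; low-similarity matches then contribute less to the loss, which is one simple way to dampen the effect of poorly matched pairs.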

Taxonomy

Topics: Speech and dialogue systems

Methods: Contrastive Learning