Multimodal Federated Learning via Contrastive Representation Ensemble

Qiying Yu; Yang Liu; Yimu Wang; Ke Xu; Jingjing Liu

arXiv:2302.08888 · cs.LG · May 9, 2023 · 35 cites

TL;DR

This paper introduces CreamFL, a novel multimodal federated learning framework that enables heterogeneous client models to collaboratively learn richer representations without sharing raw data, improving performance on multimodal tasks.

Contribution

CreamFL allows training larger, heterogeneous models in federated settings using contrastive ensemble strategies, addressing modality and task gaps for better multimodal fusion.

Findings

01. Outperforms state-of-the-art FL methods on image-text retrieval.

02. Enhances multimodal representation through contrastive regularization.

03. Effective in tasks like visual question answering.

Abstract

With the increasing amount of multimedia data on modern mobile systems and IoT infrastructures, harnessing this rich multimodal data without breaching user privacy has become a critical issue. Federated learning (FL) serves as a privacy-conscious alternative to centralized machine learning. However, existing FL methods extended to multimodal data all rely on model aggregation at the single-modality level, which constrains the server and clients to share an identical model architecture for each modality. This limits the global model in terms of both model complexity and data capacity, not to mention task diversity. In this work, we propose Contrastive Representation Ensemble and Aggregation for Multimodal FL (CreamFL), a multimodal federated learning framework that enables training larger server models from clients with heterogeneous model architectures and data modalities, while only communicating…
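
The abstract is cut off before it describes the method, so the following is only a rough illustration of the general idea behind contrastive learning on paired image and text representations. It computes a generic symmetric InfoNCE loss and is not CreamFL's actual objective. The function names, the temperature value of 0.07, and the toy vectors are all assumptions made for this sketch.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length (assumes vec is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_on_diagonal(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def info_nce(image_embs, text_embs, temperature=0.07):
    """Generic symmetric InfoNCE loss over a batch of paired embeddings.

    image_embs[i] and text_embs[i] are treated as a positive pair;
    every other combination in the batch is treated as a negative.
    The temperature default is an illustrative assumption.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # sim[i][j] = cosine similarity of image i and text j, scaled by temperature
    sim = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    sim_transposed = [list(col) for col in zip(*sim)]
    image_to_text = cross_entropy_on_diagonal(sim)
    text_to_image = cross_entropy_on_diagonal(sim_transposed)
    return 0.5 * (image_to_text + text_to_image)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matching_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned pairs:", info_nce(images, matching_texts))
    print("swapped pairs:", info_nce(images, swapped_texts))
```

In this toy setup the loss is close to zero when each image embedding matches its own text embedding and becomes large when the pairs are swapped, which is the behavior an image-text retrieval model is trained toward.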

Code & Models

Repositories

flair-thu/creamfl (PyTorch, official)

Videos

Multimodal Federated Learning via Contrastive Representation Ensemble · SlidesLive

Taxonomy

Topics: Privacy-Preserving Technologies in Data