Communication-Efficient and Robust Multi-Modal Federated Learning via Latent-Space Consensus

Mohamed Badi, Chaouki Ben Issaid, and Mehdi Bennis

arXiv:2603.19067·cs.LG·March 20, 2026·IEEE Wirel. Commun. Lett.

TL;DR

This paper proposes CoMFed, a federated learning framework that efficiently and robustly trains multi-modal models by aligning compressed latent representations across clients, reducing communication costs and handling heterogeneity.

Contribution

It introduces learnable projections and a latent-space regularizer to improve multi-modal federated learning's efficiency and robustness, addressing heterogeneity and privacy concerns.

Findings

01. Achieves competitive accuracy on human activity recognition benchmarks.

02. Reduces communication overhead compared to traditional federated learning methods.

03. Enhances robustness to outliers and modality heterogeneity.

Abstract

Federated learning (FL) enables collaborative model training across distributed devices without sharing raw data, but applying FL to multi-modal settings introduces significant challenges. Clients typically possess heterogeneous modalities and model architectures, making it difficult to align feature spaces efficiently while preserving privacy and minimizing communication costs. To address this, we introduce CoMFed, a Communication-Efficient Multi-Modal Federated Learning framework that uses learnable projection matrices to generate compressed latent representations. A latent-space regularizer aligns these representations across clients, improving cross-modal consistency and robustness to outliers. Experiments on human activity recognition benchmarks show that CoMFed achieves competitive accuracy with minimal overhead.
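The abstract names two mechanisms without giving formulas: per-client projection matrices that compress features into a shared latent space, and a regularizer that pulls those latents toward agreement. The sketch below is one plausible reading, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the squared-distance penalty, the weight `lam`, the coordinate-wise median used as the consensus, and the premise that each row of a client's latent matrix refers to the same anchor item (for example, a per-class mean) on every client.

```python
import numpy as np

def project(features, P):
    """Compress features of shape (n, d_in) to latents (n, d_latent) using P (d_in, d_latent)."""
    return features @ P

def consensus(latents):
    """Coordinate-wise median across clients (an assumed, outlier-tolerant aggregate)."""
    return np.median(np.stack(latents), axis=0)

def alignment_penalty(z, z_ref, lam=0.1):
    """Assumed regularizer: lam times the mean squared distance to the consensus."""
    return lam * np.mean(np.sum((z - z_ref) ** 2, axis=1))

rng = np.random.default_rng(0)
n_anchors, d_latent = 4, 3

# Three hypothetical clients whose modalities yield different feature widths.
feature_dims = [8, 5, 6]
features = [rng.normal(size=(n_anchors, d)) for d in feature_dims]

# One projection per client; in training these would be learned, here they are random.
projections = [0.1 * rng.normal(size=(d, d_latent)) for d in feature_dims]

# Each client shares only its (n_anchors, d_latent) latents, not raw data or full features.
latents = [project(f, P) for f, P in zip(features, projections)]
z_ref = consensus(latents)

# Each client would add this term to its local task loss.
penalties = [alignment_penalty(z, z_ref) for z in latents]
```

Under these assumptions, the per-round payload per client is an `n_anchors × d_latent` array whose size does not depend on the client's feature width or model architecture, which is one way the heterogeneity and communication claims could fit together. The median is a stand-in for whatever robust aggregation the paper actually uses.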

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Context-Aware Activity Recognition Systems · Domain Adaptation and Few-Shot Learning