Modality Alignment Meets Federated Broadcasting
Yuting Ma, Shengeng Tang, Xiaohua Xu, Lechao Cheng

TL;DR
This paper proposes a federated learning framework using modality alignment inspired by multi-modal models like CLIP, which improves model performance and robustness across heterogeneous data distributions by aligning cross-client learning through a server-client communication scheme.
Contribution
It introduces a novel FL approach with modality alignment, leveraging pre-trained models and low-rank adaptation to enhance performance in heterogeneous data scenarios.
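As a rough illustration of why low-rank adaptation keeps the trainable (and transmitted) parameter set small, the sketch below applies the standard LoRA update W + (alpha / r) * B A to a frozen weight matrix in plain Python. The function names, the shapes, and the `alpha` scaling are assumptions chosen for illustration; the summary does not say which layers the authors adapt or which rank they use.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A.

    w: frozen pre-trained weight, shape d_out x d_in
    a: trainable down-projection, shape r x d_in
    b: trainable up-projection, shape d_out x r
    alpha: scaling hyperparameter (hypothetical value in the usage below)
    """
    r = len(a)  # the rank is the number of rows of A
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update, shape d_out x d_in
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, r):
    """Number of trainable parameters in A and B combined."""
    return r * (d_in + d_out)


# Usage: a 2 x 3 frozen weight with a rank-1 adapter.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
a = [[1.0, 1.0, 1.0]]   # r x d_in = 1 x 3
b = [[0.5], [0.0]]      # d_out x r = 2 x 1
w_eff = lora_effective_weight(w, a, b, alpha=1.0)
```

For a hypothetical 512 x 512 layer, a rank-4 adapter trains 4 * (512 + 512) = 4096 values instead of 262144, which is the kind of saving that makes exchanging only the adapter attractive in a federated setting.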
Findings
Improves model generalization in heterogeneous FL settings
Maintains robustness across diverse data distributions
Improves communication efficiency by updating only select parameters through LoRA
Abstract
Federated learning (FL) has emerged as a powerful approach to safeguard data privacy by training models across distributed edge devices without centralizing local data. Despite advancements in homogeneous data scenarios, maintaining performance between the global and local clients in FL over heterogeneous data remains challenging due to data distribution variations that degrade model convergence and increase computational costs. This paper introduces a novel FL framework leveraging modality alignment, where a text encoder resides on the server, and image encoders operate on local devices. Inspired by multi-modal learning paradigms like CLIP, this design aligns cross-client learning by treating server-client communications akin to multi-modal broadcasting. We initialize with a pre-trained model to mitigate overfitting, updating select parameters through low-rank adaptation (LoRA) to meet…
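One plausible reading of the server-text, client-image design is a CLIP-style objective in which each client pulls its image embeddings toward the text embedding of the matching class. The sketch below is a minimal version of that idea under stated assumptions: the function names, the temperature value, and the premise that the server shares one text embedding per class are all hypothetical, since the abstract is truncated and does not specify the loss.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def alignment_loss(image_embs, labels, text_embs, temperature=0.07):
    """Mean cross-entropy over cosine-similarity logits.

    image_embs: client-side image embeddings, one vector per sample
    labels: integer class index for each sample
    text_embs: one class text embedding per class (assumed to come from the server)
    temperature: softmax temperature (assumed value, not taken from the paper)
    """
    text_n = [l2_normalize(t) for t in text_embs]
    total = 0.0
    for emb, y in zip(image_embs, labels):
        e = l2_normalize(emb)
        logits = [sum(p * q for p, q in zip(e, t)) / temperature for t in text_n]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[y]
    return total / len(image_embs)


# Usage: two classes with orthogonal text embeddings.
text_embs = [[1.0, 0.0], [0.0, 1.0]]
aligned = alignment_loss([[2.0, 0.0]], [0], text_embs)
misaligned = alignment_loss([[2.0, 0.0]], [1], text_embs)
```

Because every client would be scored against the same set of text embeddings, those embeddings act as a shared anchor across heterogeneous local datasets, which is the intuition behind describing server-client communication as multi-modal broadcasting.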
Taxonomy
Topics: Multimedia Communication and Technology · Algorithms and Data Compression · Digital Rights Management and Security
Methods: Contrastive Language-Image Pre-training
